US

Marios  Pattichis

US011076153B2

(12) United States Patent
     Pattichis et al.
(10) Patent No.: US 11,076,153 B2
(45) Date of Patent: Jul. 27, 2021

(54) SYSTEM AND METHODS FOR JOINT AND ADAPTIVE CONTROL OF RATE, QUALITY, AND COMPUTATIONAL COMPLEXITY FOR VIDEO CODING AND VIDEO DELIVERY
(71) Applicant: STC.UNM, Albuquerque, NM (US)
(72) Inventors: Marios Stephanou Pattichis, Albuquerque, NM (US); Yuebing Jiang, Santa Clara, CA (US); Cong Zong, Albuquerque, NM (US); Gangadharan Esakki, Albuquerque, NM (US); Venkatesh Jatla, Albuquerque, NM (US); Andreas Panayides, Strovolos (CY)
(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 0 days.
(21) Appl. No.: 15/747,982
(22) PCT Filed: Jul. 31, 2016
(86) PCT No.: PCT/US2016/044942
     § 371 (c)(1), (2) Date: Jan. 26, 2018
(87) PCT Pub. No.: WO2017/023829
     PCT Pub. Date: Feb. 9, 2017
(65) Prior Publication Data
     US 2018/0220133 A1, Aug. 2, 2018
     Related U.S. Application Data
(60) Provisional application No. 62/199,438, filed on Jul. 31, 2015.
(51) Int. Cl.
     H04N 19/127 (2014.01)
     H04N 19/172 (2014.01)
     H04N 19/149 (2014.01)
     H04N 19/119 (2014.01)
     H04N 19/126 (2014.01)
     H04N 19/147 (2014.01)
(52) U.S. Cl.
     CPC: H04N 19/127 (2014.11); H04N 19/119 (2014.11); H04N 19/126 (2014.11); H04N 19/147 (2014.11); H04N 19/149 (2014.11); H04N 19/172 (2014.11)
(58) Field of Classification Search
     CPC: G06F 15/177; H04N 19/127; H04N 19/126; H04N 19/149; H04N 19/119; H04N 19/172; H04N 19/147
     USPC: 375/240.15, 240.03
     See application file for complete search history.
     Primary Examiner: Tung T Vo
(74) Attorney, Agent, or Firm: Valauskas Corder LLC

(57) ABSTRACT

System and methods for the joint control of reconstructed video quality, computational complexity and compression rate for intra-mode and inter-mode video encoding in HEVC. The invention provides effective methods for (i) generating a Pareto front for intra-coding by varying CTU parameters and the QP, (ii) generating a Pareto front for inter-coding by varying GOP configurations and the QP, (iii) real-time and offline Pareto model front estimation using regression methods, (iv) determining the optimal encoding configurations based on the Pareto model by root finding and local search, and (v) robust adaptation of the constraints and model updates at both the CTU and GOP levels.

18 Claims, 14 Drawing Sheets

(56) References Cited

U.S. PATENT DOCUMENTS

6,426,772 B1 *       7/2002   Yoneyama
8,798,137 B2 *       8/2014   Po
8,934,538 B2 *       1/2015   Yang
9,111,059 B2         8/2015   Pattichis et al.
9,542,198 B2         1/2017   Pattichis et al.
2006/0133480 A1 *    6/2006   Chang
2009/0175330 A1 *    7/2009   Chen
2011/0164677 A1 *    7/2011   Lu
2011/0235928 A1 *    9/2011   Strom
2011/0305144 A1 *   12/2011   Sethakaset
2012/0287987 A1 *   11/2012   Budagavi
2013/0094565 A1 *    4/2013   Yang
2013/0129241 A1 *    5/2013   Wang
2013/0272383 A1 *   10/2013   Xu
2014/0016693 A1 *    1/2014   Zhang
2014/0161177 A1 *    6/2014   Sim
2014/0192862 A1 *    7/2014   Flynn
2015/0030068 A1 *    1/2015   Sato
2015/0049805 A1 *    2/2015   Zhou
2015/0092840 A1 *    4/2015   Mochizuki
2015/0103892 A1 *    4/2015   Zhou
2015/0110181 A1 *    4/2015   Saxena
2015/0172661 A1 *    6/2015   Dong
2015/0215631 A1 *    7/2015   Zhou
2015/0271531 A1 *    9/2015   Wen
2015/0326883 A1 *   11/2015   Rosewarne
2015/0373328 A1 *   12/2015   Yenneti
2016/0050422 A1 *    2/2016   Rosewarne
2016/0088298 A1 *    3/2016   Zhang
2016/0094855 A1 *    3/2016   Zhou
2016/0127733 A1 *    5/2016   Wan
2016/0156917 A1 *    6/2016   Ugur
2016/0173875 A1 *    6/2016   Zhang
2016/0295217 A1 *   10/2016   Suzuki
2016/0314603 A1     10/2016   Carranza et al.
2016/0316215 A1 *   10/2016   Minoo
2017/0013261 A1 *    1/2017   Lin
2018/0184089 A1 *    6/2018   Zhang

OTHER PUBLICATIONS

Yuebing Jiang, Gangadharan Esakki, and Marios Pattichis, Dynamically Reconfigurable Architecture System for Time-varying Image Constraints (DRASTIC) for HEVC Intra Encoding, 2013, IEEE, pp. 1112-1116. *
Gary J. Sullivan, Jens-Rainer Ohm, Woo-Jin Han, and Thomas Wiegand, Overview of the High Efficiency Video Coding (HEVC) Standard, IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, No. 12, Dec. 2012. *
Gangadharan Esakki, Dynamic Switching of GOP Configurations in High Efficiency Video Coding (HEVC) using Relational Databases for Multi-objective Optimization, Sep. 12, 2014. *
Gangadharan Esakki et al., "Dynamic Switching of GOP Configurations in High Efficiency Video Coding (HEVC) using Relational Databases for Multi-objective Optimization." The University of New Mexico, 2014. website http://digitalrepository.unm.edu/ece_etds/80.
Jiang et al., "Dynamically reconfigurable architecture system for time-varying image constraints (drastic) for hevc intra encoding", Asilomar Conference on Signals Systems and Computers, pp. 1112-1116, Nov. 2013.
Jiang et al., "Dynamically reconfigurable DCT architectures based on bitrate power and image quality considerations", 19th IEEE International Conference on Image Processing (ICIP), pp. 2465-2468, 2012.
Jiang et al., "Dynamically reconfigurable architecture system for time-varying image constraints (drastic) for motion jpeg", J Real-Time Image Proc (2018) 14: 395. https://doi.org/10.1007/s11554-014-0460-8.

* cited by examiner

U.S. Patent    Jul. 27, 2021    Sheet 1 of 14    US 11,076,153 B2

FIG. 1 (Sheet 1 of 14): [block diagram 100: DRASTIC Controller 102 with Time 102A, Rate 102B, Quality 102C; reference numerals 104, 106, 108; blocks: Input, Frame, CTU, Split, CU, TU, Tran., Quant., Entropy coding, inv. Quant., inv. Tran., Intra Prediction, Intra Picture Buffer, Recon., DBF, SAO, Decoded Picture Buffer]

FIG. 2 (Sheet 1 of 14): [diagram: CU sizes 64, 32, 16, 8 with processing ids; regions "A" (region 1) and "B" (region 2); remaining labels not recoverable]

FIG. 3 (Sheet 2 of 14): [plot; axis labels not recoverable]

FIG. 4 (Sheet 2 of 14): [diagram: three neighboring CTUs labeled (psnr1, bits1, time1), (psnr2, bits2, time2), (psnr3, bits3, time3)]

FIG. 5A (Sheet 3 of 14):

    Estimate budgets for T, Q, R for all CTUs.
    Estimate QP and Config using initial model.
    ▷ Encode frame by iterating through the CTUs.
    for each CTU in current frame do
        ▷ Robust allocation of T, Q, R within available budgets.
        Allocate T, Q, R based on available budgets.
        Update remaining budgets for T, Q, R.
        if any remaining budget < 0 then
            ▷ Adjust budget to minimize the violation.
            Reallocate CTU budgets using a fraction of the remaining total frame budget.
        end if
        ▷ Robust model update.
        Update model using three neighboring CTUs.
        if model update failed then
            Update model with neighboring CTU model that gave best prediction.
        end if
        cont...

FIG. 5B (Sheet 4 of 14):

        cont...
        ▷ Robust parameter estimation and optimization.
        Estimate QP and Config based on the model.
        Solve optimization problem using local search.
        if either QP or Config is out of range then
            ▷ Update constraints and fix encodings.
            Update constraints and estimate new estimates of QP and Config.
            Constrain QP to be within ± (value not legible) of neighboring CTUs.
            Enforce QP and Config within valid ranges.
        end if
        ▷ Encode CTU and store encoding parameters.
        Encode CTU using QP and Config.
        Compute T, Q, R for current CTU.
        Save QP, Config, T, Q, R and CTU location for model updates.
    end for

FIG. 6 (Sheet 5 of 14): [diagram: grid of blocks labeled "Appended CTU"]

FIG. 7 (Sheet 5 of 14): [diagram: neighboring CTUs labeled (psnr2, bits2, time2) and (psnr3, bits3, time3)]

FIG. 8 (Sheet 6 of 14):

    Mode                  Objective (minimum)
    Minimum Rate          norm(abs(MSEest - MSEtarget)) + norm(abs(Timeest - Timetarget))
    Minimum Complexity    norm(abs(MSEest - MSEtarget)) + norm(abs(BPSest - BPStarget))
    Maximum Quality       norm(abs(Timeest - Timetarget)) + norm(abs(BPSest - BPStarget))

FIG. 9 (Sheet 6 of 14):

    ▷ Use CTU SSE and times T to estimate a, b.
    if (SSEtop != SSEleft) and (Ttop != Tleft) then
        b = log(Ttop / Tleft) / log(SSEtop / SSEleft)
        a = Ttop / SSEtop^b
    end if

FIG. 10 (Sheet 7 of 14):

    ▷ Estimate ratios associated with current CTU.
    Tused ← T / Ttarget,i
    SSEused ← SSE / SSEtarget,i
    if (SSEused > 1) and (Tused > 1) then
        ▷ Above the target.
        if (SSEused < Tused) then
            ▷ Reduce time to meet the curve.
            Ttarget,i = a · Qtarget^b
        else
            ▷ Reduce SSE to meet the curve.
            Qtarget = (Ttarget / a)^(1/b)
        end if
    else
        ▷ Below the target.
        if (SSEused > Tused) then
            ▷ Increase time to meet the curve.
            Ttarget = (Qtarget / a)^(1/b)
        else
            ▷ Increase SSE to meet the curve.
            Qtarget = a · Ttarget^b
        end if
    end if

FIG. 11 (Sheet 8 of 14):

    ▷ Use CTU SSE and bitrates R to estimate a, b.
    if (SSEtop != SSEleft) and (Rtop != Rleft) then
        b = log(SSEtop / SSEleft) / log(Rtop / Rleft)
        a = SSEtop / Rtop^b
    end if

FIG. 12 (Sheet 8 of 14):

    Rused ← R / Rtarget,i
    SSEused ← SSE / SSEtarget,i
    if (Rused > 1) and (SSEused > 1) then
        if (SSEused < Rused) then
            Rtarget = (Qtarget / a)^(1/b)
        else
            Qtarget = a · Rtarget^b
        end if
    else
        if (SSEused > Rused) then
            Rtarget = (Qtarget / a)^(1/b)
        else
            Qtarget = a · Rtarget^b
        end if
    end if

FIG. 13 (Sheet 9 of 14):

    ▷ Use CTU encoding times T and rates R
    ▷ to estimate a, b for the model.
    if (Ttop != Tleft) and (Rtop != Rleft) then
        b = log(Ttop / Tleft) / log(Rtop / Rleft)
        a = Ttop / Rtop^b
    end if

FIG. 14 (Sheet 9 of 14):

    Tused ← T / Ttarget,i
    Rused ← R / Rtarget,i
    if (Rused > 1) and (Tused > 1) then
        if (Tused < Rused) then
            Ttarget = a · Rtarget^b
        else
            Rtarget = (Ttarget / a)^(1/b)
        end if
    else
        if (Tused > Rused) then
            Rtarget = (Ttarget / a)^(1/b)
        else
            Ttarget = a · Rtarget^b
        end if
    end if

FIG. 15 (Sheet 10 of 14): [plot; legible axis labels: BPS (Mbps), time (s), frame index, config; tick values not recoverable]

FIG. 16 (Sheet 11 of 14): [plot; legible axis labels: BPS (Mbps), dB; tick values not recoverable]

FIG. 17 (Sheet 12 of 14): [flow chart, reference numerals 200-208: Videos, Configurations, Encode Videos, Linear Regression, Forward Models (Quality, Bitrate, Encoding Time)]

FIG. 18 (Sheet 12 of 14):

    function VideoEncodingAndForwardModels
    Input: input videos Vd, configuration files Cnf, parameter values Prmval.
    Output: equations of forward models Fwdeq
    for (each Video in Vd)
        for (each Configuration in Cnf)
            Encode video and extract parameter values in terms of QP, SSIM, Frame Rate and Bitrate and store in Prmval.
        end for
    end for
    for (SSIM / Frame Rate / Bitrate in Prmval)
        Train and validate forward individual regression models for SSIM, Frame Rate and Bitrate.
        For each model create an equation and store it to Fwdeq.
    end for
    end function

FIG. 19 (Sheet 13 of 14): [flow chart, reference numerals 300-312 with 304A, 304B, 304C; legible labels: Forward Model, real-time varying constraints, GOP Level Adaptation, Select Optimal Configuration]

FIG. 20 (Sheet 14 of 14):

    function AdaptiveEncoding
    Input: equations of forward models Fwdeq, parameter values Prmval, group of pictures GOP, time varying constraints Tvc, parameter constraints Prmcns, forward models predictions Fwdmdpd.
    Output: new encoding parameters Nencpm.
    # Time varying constraints initialization
    Initialize SSIM constraint in Tvc
    Initialize Frame Rate constraint in Tvc
    Initialize Bitrate constraint in Tvc
    for (each Gop in GOP)
        if (current Tvc != previous Tvc) then
            Create Prmcns for Tvc
        else
            Use empty current setting
        end if
        for (each ForwardModel in Fwdeq)
            Input Prmcns in each Forward Model.
Use Newton's algorithm to predict QP values as Fwdmdpd and create initial candidate configurations for (eachForward ModelPrediction in Fwdmdpd) Create inverse model equation and optimal solution and predict quantization parameter value that meets SSIM , Frame Rate and Bitrate constraints. These are the final candidate configurations end for end for Selectoptimal configuration and store values to Nencpm . end for end function FIG . 20 US 11,076,153 B2 1 SYSTEM AND METHODS FOR JOINT AND 2 The design of most video coding standards is primarily ADAPTIVE CONTROL OF RATE , QUALITY, AND COMPUTATIONAL COMPLEXITY FOR VIDEO CODING AND VIDEO DELIVERY CROSS - REFERENCE TO RELATED APPLICATIONS This application claims the benefit of U.S. Provisional Patent Application No. 62 / 199,438 filed Jul . 31 , 2015 , incorporated by reference . FEDERALLY -SPONSORED RESEARCH OR DEVELOPMENT This invention was made with government support under CNS1422031 awarded by the National Science Foundation (NSF ) . The government has certain rights in the invention . FIELD OF THE INVENTION aimed at having the highest compression efficiency, or ability to encode video at the lowest possible bit rate while maintaining a certain level of video quality. High -efficiency 5 video coding (HEVC ) , also known as H.265 , is a video compression standard that has provided substantial improve ments to video compression . Compared to H.264 , HEVC aims at a 50 % bit rate reduction at equivalent video quality levels . Unfortunately, bitrate performance improvements 10 come at substantial increase in computational complexity . HEVC benefits from the use of larger coding tree unit (CTU) sizes to increase coding efficiency while also reduc ing decoding time . HEVC also uses other coding tools . 
15 These coding tools include context -adaptive binary arithme tic coding (CABAC ) as the only entropy encoder method, transform units ( TUS ) to code the prediction residual, recur sive coding, complex intra -prediction modes and asymmet ric inter prediction unit division . In addition , two loop filters 20 are applied sequentially, with the deblocking filter ( DBF ) applied first and the sample adaptive offset ( SAO ) filter applied afterwards. The invention relates generally to computer software for At a higher - level, for inter encoding , HEVC relies on the video communications. More specifically, the invention use of Group Of Pictures (GOP ) configurations to achieve relates to image processing, intra -mode video encoding, and 25 different levels of performance. Video encoding efficiency inter -mode video encoding that is compatible with the depends heavily on the GOP configurations. high -efficiency video coding ( HEVC ) standard . There has been strong research interest in reducing HEVC The following patent applications are incorporated by encoding complexity for both inter- and intra - coding. Inter reference : U.S. patent application Ser. No. 14 /069,822 filed coding compresses pictures based on their GOP configura Nov. 1 , 2013 , now U.S. Pat . No. 9,111,059 ; U.S. patent 30 tion . Intra - coding compresses each picture independent of application Ser. No. 14/ 791,627 filed Jul. 6 , 2015 ; and the other. For reducing the computational complexity for International Patent Application PCT/US14 /70371 filed inter coding , for example, use of different configuration Dec. 15 , 2014 , now U.S. patent application Ser. No. 15/103 , modes have been introduced . Methods that have been used 977 . for reducing the computational complexity for intra -coding 35 include the use of a rough mode set (RMS ) , gradient based BACKGROUND OF THE INVENTION intra -prediction, and coding unit (CU) depth control. 
Unfor tunately, these prior approaches did not take into account Computer systems include hardware and software . Hard- that video compression requirements can jointly vary with ware includes the physical components that make up a network conditions, energy /power constraints, or varying computer system . Software includes programs and related 40 expectations of video quality. Thus, it is not sufficient to data that provide the instructions for telling computer hard- reduce computational complexity without considering the ware what to do and how to do it . implications on bitrate and video quality. Computer system hardware includes a processor that permits access to a collection of computing resources and process , or other resource for a limited or defined duration . Although HEVC is considered a high -efficiency codec, there is a need to jointly control bitrate, video quality, and computational complexity for both intra-coding and inter coding . The invention satisfies this demand . digital signal processor configured to carry out the instruc SUMMARY OF THE INVENTION components that can be invoked to instantiate a machine, 45 A processor may be special purpose or general- purpose tions of a computer program by performing the basic arith metic , logical, control and input/output (1/0 ) operations 50 The invention is directed to adaptive methods that can specified by the instructions. Specifically, a processor or adjust video compression parameters and jointly control central processing unit ( CPU ) —includes a processing unit computational complexity, image quality , and bandwidth ( or and control unit ( CU) . Most modern CPUs are micropro- bitrate ). The system and methods simultaneously minimize cessors contained on a single integrated circuit ( IC ) chip. 
A computational complexity, maximize image quality, and computer system also includes non -transitory computer- 55 minimize bandwidth subject to constraints on available readable storage medium such as a main memory , for energy /power, bandwidth , and the minimum level of accept able video quality. The proposed system and methods extend example random access memory (RAM ). Computer systems may include any device through the the previously filed patent applications that are cited above use of which implements the methods according to the by providing effective methods for: (i ) generating a Pareto invention, for example as computer code . Computer systems 60 front for intra-coding by varying CTU parameters and the may include, for example, traditional computer, portable QP, ( ii ) generating a Pareto front for inter-coding by varying computer, handheld device , mobile phone , personal digital GOP configurations and the QP , (iii ) real - time and offline assistant, smart hand -held computing device , cellular tele- Pareto model front estimation using regression methods , ( iv ) phone , or a laptop or netbook computer, hand held console determining the optimal encoding configurations based on or MP3 player, tablet , or similar hand held computer device , 65 the Pareto model by root finding and local search , and ( v ) such as an iPad® or iPhone® , and embedded devices or robust adaptation of the constraints and model updates at those that contain a special -purpose computing system . both the CTU and GOP levels . The system and methods 3 US 11,076,153 B2 4 apply to both inter - coding ( each picture is compressed ling the minimum size of the coding unit (CU) . The mini independent of the other) and intra - coding (pictures are mum size encoding parameter is used to ensure hierarchical compressed in groups ). partitioning. 
An increase in the minimum code size always Advantageously, the system and methods of the invention results in better coding performance since there are more can be applied to both intra -coding and inter-coding for the 5 choices. Thus, increasing the minimum code size increases high -efficiency video coding (HEVC ) , previous, and future quality, increase computational complexity, and bitrate. video encoding standards. Similarly, decreasing the minimum code size decreases The invention designs methods that can solve mincec ( T, quality, computational complexity, and bitrate . R , -Q ) with T representing encoding time per frame, R Another object of the invention is static and dynamic representing the number of bits per sample, C representing 10 control of rate - quality -performance. According to the inven the set of all possible video encoding configurations, and Q tion , the rate - quality -performance surface depends on the representing a measure of video quality ( e.g. , PSNR of minimum coding size and QP and uses the model to imple average SSIM )—the negative sign expressing maximum ment the minimum bitrate, maximum quality, and maximum quality ( and hence minimize -Q ) . The multi -objective sur- performance modes . The approach also allows dynamic face of solutions that satisfy mince ( T, R , -Q ) forms the 15 switching between modes . For example , using an HEVC Pareto front. The invention describes optimization methods standard test video and the dynamic reconfiguration between that select encoding configurations c E C that produces low, medium and high profiles proved to meet constraints points on the Pareto front. 93 % (low ), 83 % (medium ), 93 % ( high ) — , while delivering The invention uses a controller embedded in software to encoding time savings of 13 % , 49 % and 40% respectively. handle the optimization process. 
The controller is provided 20 The invention uses cross -validated regression to quickly with measurements of encoding time , rate, image quality build optimal models since thousands of possibilities do not and constraints (e.g. , available network bandwidth , available need to be evaluated . A root finding algorithm is used to battery energy, user determined quality ). For intra -coding, solve for the optimal values . These solutions are used by a the controller dynamically adjusts CTU configurations and relaxation procedure to find actual, integer -based , software the quantization parameter (QP ) . For inter - coding, the con- 25 parameters. troller dynamically adjusts the GOP configurations and the The invention also applies to inter -mode HEVC encoding. QP. The dynamic control is used to realize the optimization For inter mode HEVC encoding, encoding efficiency modes listed above in the approved patent application. depends heavily on the GOP configurations. Initially, for The invention provides constraint optimization solutions inter -mode HEVC encoding , the approach generates Pareto to the minimum computational complexity mode, the maxi- 30 front models using an offline process . These models are used mum quality mode, and the minimum bitrate mode . For to adapt to time -varying constraints during real -time opera example, video quality may be related to application -mo- tion . Thus, an advantage of the invention is an offline dality level adaptation , bitrate demands may be related to process of video encoding including forward model creation wireless network adaptation and encoding frame rate may and another advantage is the real -time adaptation to time relate to device adaptation for real -time operation. For each 35 varying constraints - for example state of a wireless network mode , one of the objectives (e.g. 
, computational complexity, to guarantee acceptable performance throughout a streaming quality, or bitrate ) is optimized, while suitable constraints session . Yet another advantage is the adaptation to con are placed on the other two . For example, for the minimum straints of modes — maximum video quality, minimum computational complexity mode , the invention minimizes bitrate , maximum frame rate on a GOP basis . computational complexity of HEVC subject to constraints in 40 The invention and its attributes and advantages may be bitrate and reconstruction quality. The constraint-optimiza- further understood and appreciated with reference to the tion approach provides an extension to the use of bit detailed description below of one contemplated embodi constrained rate - distortion optimization by also minimizing ment, taken in conjunction with the accompanying draw or constraining computational complexity . Overall, the ings . invention provides joint control of reconstructed video qual- 45 BRIEF DESCRIPTION OF THE DRAWINGS ity, computational complexity, and compression rate . For intra -mode HEVC encoding , the approach uses a configuration parameter that controls the partitioning of the The preferred embodiments of the invention will be coding tree unit ( CTU) so as to provide for finer control of described in conjunction with the appended drawings pro the encoding process. By jointly sampling the quantization 50 vided to illustrate and not to limit the invention. FIGS . 1-16 parameter ( QP) and the CTU configuration mode , the are directed to intra -coding and FIGS . 17-20 are directed to approach generates a finely - sampled , Pareto - optimal, rate- inter -coding, where like designations denote like elements, and in which : quality -performance surface . The quantization parameter ( QP ) and a quad -tree -depth FIG . 1 is a block diagram of the intra -coding system and oriented coding tree unit ( CTU) configuration are adaptively 55 methods of the invention . 
controlled to deliver performance that is optimal in the FIG . 2 illustrates a figure of the CTU partition control complexity -rate -quality performance space . The invention based on the config parameter according to the invention . employs a spatially adaptive model that uses neighboring FIG . 3 is a plot diagram of a rate -distortion -complexity configurations to estimate optimal values for QP and the performance example for intra -coding according to the coding tree unit configuration (CTU) . More specifically, the 60 invention . invention provides a robust, spatially -adaptive control algoFIG . 4 illustrates a model update using 3 neighboring rithm for solving the minimum bitrate, maximum quality, CTUs according to the invention . and minimum computational complexity optimization probFIG . 5A and FIG . 5B illustrates pseudo code of a common lems . framework for intra - coding mode implementation according One object of the invention is Hierarchical coding unit 65 to the invention . ( CU) partitioning for fine, joint control of rate - qualityFIG . 6 illustrates a model update for the first row and the performance. Intra - encoding control is achieved by control- first column according to the invention . 5 US 11,076,153 B2 6 FIG . 7 illustrates a performance constraint model update important than the other. However, while allocating more using neighbor CTUs according to the invention . resources to , for example, performance, the system strives to FIG . 8 illustrates a table of constraint violation objectives maintain optimal energy , power, and accuracy at the highest level without taking away from performance resources . As according to the invention . example, digital video processing requires significant FIG . 9 illustrates pseudo code of the time- quality rela- 5 an hardware resources to achieve acceptable performance. 
tionship model update for minimum bitrate mode for intra-coding according to the invention.

FIG. 10 illustrates pseudo code of the constraint updates for minimum bitrate mode for intra-coding according to the invention.

FIG. 11 illustrates pseudo code of the quality-rate relationship model update for minimum computational complexity mode according to the invention.

FIG. 12 illustrates pseudo code of the constraint update for minimum computational complexity mode according to the invention.

FIG. 13 illustrates pseudo code of the time-rate relationship model update for maximum quality (minimum distortion mode) according to the invention.

FIG. 14 illustrates pseudo code of the constraint update for the minimum distortion mode according to the invention.

FIG. 15 illustrates a graph of the results of current methods that use a fixed CTU configuration while varying only the QP, which cannot be used to achieve real-time control of rate-complexity-quality.

FIG. 16 illustrates a graph of the results using optimal QP and CTU configuration to achieve optimal and real-time control of rate-complexity-quality for intra-coding according to the invention.

FIG. 17 illustrates a flow chart of an offline process of video encoding and forward model creation for inter-coding according to the invention.

FIG. 18 illustrates pseudo code of the offline process of video encoding and forward model creation for inter-coding according to the invention.

FIG. 19 illustrates a flow chart of a real-time adaptation using time-varying constraints for inter-coding according to the invention.

FIG. 20 illustrates pseudo code of the real-time adaptation using time-varying constraints for inter-coding according to the invention.

DETAILED DESCRIPTION OF EMBODIMENTS OF THE INVENTION

The following patent applications are incorporated by reference: U.S. patent application Ser. No. 14/069,822 filed Nov. 1, 2013, now U.S. Pat. No. 9,111,059; U.S. patent application Ser. No. 14/791,627 filed Jul. 6, 2015; and International Patent Application PCT/US14/70371 filed Dec. 15, 2014, now U.S. patent application Ser. No. 15/103,977.

Dynamically reconfigurable frameworks offer unique advantages over non-dynamic systems. Dynamic adaptation provides the ability to adapt software and hardware resources to meet real-time varying requirements.

Embodiments of the invention include a system and methods for improving resource management in embedded computer systems. The managed resources (or objectives) may be directed to constraints. The term constraint is also referred to as real-time constraint or time-varying constraint. Time-varying constraints include, for example, constraints on the supplied power, required performance, accuracy levels, available bandwidth, and quality of output such as image reconstruction. It is contemplated that constraints can be generated by a user, by the system, or by data inputs.

The invention is directed to a system and methods for dynamic reconfiguration of software parameters for various applications such as digital signal, image, and video. For applications such as digital signal, image, and video, constraints may include, for example, dynamic power/energy consumption, performance, accuracy, bitrate, and quality of output or image reconstruction quality.

An optimal approach for jointly controlling rate-quality-complexity for both intra-mode and inter-mode is provided. According to the invention, an effective control mechanism model dynamically adjusts the quantization parameter (QP) and the coding tree unit (CTU) partition mechanism so as to achieve variable constraints on bitrate and video quality. The model is dynamically updated based on the input video.

More specifically, the invention provides a new, efficient implementation of the minimum computational complexity mode, maximum image quality mode, and the minimum bitrate mode. For all of the modes, video encoding configurations are specified so that they produce $\min_{c \in C} (T, R, -Q)$ with T representing encoding time per frame, R representing the number of bits per sample, C representing the set of all possible video encoding configurations, and Q representing a measure of video quality.

In order to jointly control T, R and Q, bounds can be provided on each one of them. For improving performance and guaranteeing computations within specific time limits, $T_{max}$ denotes an upper bound on the encoding time. Similarly, for communicating within a specific bandwidth, $R_{max}$ denotes an upper bound on the available bits per pixel. Then, to guarantee a minimum level of quality, $Q_{min}$ denotes a lower bound on the encoded video quality. Thus, in general, it is desired to encode configurations that jointly satisfy: $(R \le R_{max}) \;\&\; (T \le T_{max}) \;\&\; (Q \ge Q_{min})$.

The following optimization modes are considered: maximum performance mode, minimum rate mode, maximum quality mode.

The maximum performance mode provides the best computational performance by minimizing encoding time. An acceptable, optimal encoding configuration is obtained by solving:

$$\min_{c \in C} T \quad \text{subject to: } (Q \ge Q_{min}) \;\&\; (R \le R_{max})$$    Equation (1.1)

The minimum rate mode reduces bitrate requirements without sacrificing quality or slowing down encoding time to an unacceptable level. The optimal configuration requires the solution of:

$$\min_{c \in C} R \quad \text{subject to: } (Q \ge Q_{min}) \;\&\; (T \le T_{max})$$    Equation (1.2)

The maximum quality mode provides the best possible quality without exceeding bitrate or computational requirements. The optimal encoding is selected by solving:

$$\max_{c \in C} Q \quad \text{subject to: } (T \le T_{max}) \;\&\; (R \le R_{max})$$    Equation (1.3)
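The three modes of Equations (1.1) to (1.3) each filter the candidate configurations by two constraints and then optimize the third objective. The following is a minimal Python sketch of that selection, assuming an exhaustively measured candidate list. The `Encoding` record and the `select` function are hypothetical names introduced for illustration and do not come from the patent.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Encoding:
    """Measured objectives for one candidate configuration (hypothetical record)."""
    qp: int
    config: int
    time: float     # T: encoding time per frame
    rate: float     # R: bits per sample
    quality: float  # Q: quality measure where higher is better (e.g. PSNR)


def select(candidates: Iterable[Encoding], mode: str,
           q_min: float = float("-inf"),
           r_max: float = float("inf"),
           t_max: float = float("inf")) -> Optional[Encoding]:
    """Return the best feasible candidate for the mode, or None if none is feasible."""
    cands = list(candidates)
    if mode == "max_performance":
        # Equation (1.1): minimize T subject to Q >= Qmin and R <= Rmax.
        feasible = [c for c in cands if c.quality >= q_min and c.rate <= r_max]
        return min(feasible, key=lambda c: c.time, default=None)
    if mode == "min_rate":
        # Equation (1.2): minimize R subject to Q >= Qmin and T <= Tmax.
        feasible = [c for c in cands if c.quality >= q_min and c.time <= t_max]
        return min(feasible, key=lambda c: c.rate, default=None)
    if mode == "max_quality":
        # Equation (1.3): maximize Q subject to T <= Tmax and R <= Rmax.
        feasible = [c for c in cands if c.time <= t_max and c.rate <= r_max]
        return max(feasible, key=lambda c: c.quality, default=None)
    raise ValueError(f"unknown mode: {mode}")
```

An exhaustive scan like this is only practical for small candidate sets. The model-based control described in the text exists to avoid measuring every configuration.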
During operation of a computer system, various states may exist in which one or more of the constraints is more

An advantage of the invention is that the modes given by Equations (1.1)-(1.3) can be used to describe a large number of different, practical scenarios. For example, for video streaming applications, $T_{max}$ can be set to $T_{max} = 1/fps$ where fps denotes the number of frames per second at which the video is generated. As another example, adapting to a time-varying communications channel may be achieved by setting $R_{max}$ to the time-varying, available bandwidth.

An advantage of the invention includes the development of a control mechanism that solves the optimization problem given in Equation (1) for HEVC intra-encoding based on the Coding Tree Unit (CTU) level. Another advantage of the invention includes the effective implementation of the control mechanism using CTU performance models.

FIG. 1 is a block diagram of the system and methods of the intra-coding optimization process 100 according to the invention. A Dynamically Reconfigurable Architecture System for Time-varying Image Constraints (DRASTIC) controller or processor 102 is provided with measurements of encoding time 102A, rate 102B, and image quality 102C that the processor 102 uses to select methods for splitting the coding units (CU) 104 and transform units (TU) 106 and to set the quantization parameter (QP) 108 for the next incoming frame.

Optimal configuration management is based on scalable parametrization. The optimal configuration is based on a quantization parameter (QP) and a scalable parametrization of the CU tree based on config. It is noted that QP affects encoding time since larger QP values result in smaller bitrates, lower quality, and lower encoding times since there are fewer coefficients to encode. On the other hand, config is used for controlling the search space for specifying the coding unit sizes.

FIG. 2 illustrates a figure of Scalable Coding Tree Unit (CTU) partitioning following a breadth-first-search splitting pattern. Each block is recursively partitioned into four sub-blocks using a quadtree decomposition. The case of config = 6 is shown in FIG. 2. The labeled partitioned block ids are also shown with the CU partition control based on the config parameter. The config parameter is allowed to vary from 0 to 13. Here, scalability is achieved by making sure that the search space uses a nested subset of the full partition tree. The quad-tree partition process is controlled using a process_id ("proc. id") as shown in FIG. 2, a depth first search (DFS). Here, the config parameter gets mapped to a maximum value of the process_id. Thus, partitioning beyond the maximum value of the process_id is not considered. For example, for config = 0, any splitting is not considered. For config = 1, the original 64x64 coding unit can be split into 4 32x32 regions, but splitting is allowed except for the first 32x32 region. The decision on whether splitting is optimal or not is decided using RD optimization. For config = 6, the search tree is illustrated by "A" in FIG. 2. Tree space search is performed using depth first search (DFS). It is contemplated that the invention may be applied to TU control also; unless a split is needed, i.e. there is no 64x64 TU, a split to 32x32 TU is accepted. As shown by "B" in FIG. 2, any splitting for processes with id > 9 is prohibited.

The proposed scalable approach can be used to generate a Time-Rate-Quality performance space as shown in FIG. 3. FIG. 3 is a plot diagram of a rate-distortion-complexity performance example according to the invention for intra-coding. For each plot the following is measured: (i) time using the number of seconds per sample (SPS), (ii) rate based on the number of bits per sample (BPS), and (iii) quality using PSNR (dB).

The example is based on the first 6 frames of a video (832x480) referred to as the standard RaceHorsesC to produce the median objective surface plot shown in FIG. 3. To generate the space, QP is varied in the range of [6, 51] with a step of 3 and all 14 possible values are considered for config. In total, there are 340 possible combinations that have been verified to be optimal in the multi-objective sense (Pareto optimal). As expected, as config is increased, better Rate-Distortion performance is obtained at the price of increased computational complexity. On the other hand, higher values of QP produce configurations that require lower bitrates with lower quality and reduced computational complexity.

A simple linear model is considered for describing the relationship between the objectives and the parameters.

$$Q = a_1 \cdot QP + b_1 \cdot Config + c_1$$
$$T = a_2 \cdot QP + b_2 \cdot Config + c_2$$
$$R = a_3 \cdot QP + b_3 \cdot Config + c_3$$    Equation (2)

where Q is measured in terms of the mean squared error (MSE), T denotes the time in ns ($10^{-9}$ second) required for processing a single pixel, and R denotes the number of bits per sample.

The linear model of Equation (2) needs to be updated throughout the video frame. This model is dynamical and adjusts to the input sequence. The model may be updated based on local measurements.

The invention allocates time, quality, and rate to each CTU by controlling QP and Config. A feedback loop is used to provide measurements of time, quality, and rate to the control algorithm. The main control algorithm is presented in FIG. 4 and FIG. 5A, FIG. 5B. The basic idea is to encode each CTU independently while staying within the budget allocated to the entire frame.

FIG. 4 illustrates a block diagram of a model update using 3 neighboring CTUs according to the invention. As shown in FIG. 4, the CTU is indexed as $(CTU_y, CTU_x)$ and the 3 neighbor CTUs are indexed as $(CTU_y, CTU_{x-1})$, $(CTU_{y-1}, CTU_{x-1})$ and $(CTU_{y-1}, CTU_x)$. When the neighboring CTUs share encodings, the model is constructed using the best predictions as described below. Thus, it is possible for a model to select model parameters. FIG. 5A, FIG. 5B illustrates a common framework for mode implementation according to the invention.

Budget allocation is now described. Budget allocation refers not only to bit allocation, but also to quality and computational complexity allocation. For target rate, quality and computational complexity, the following are used: $R_{target}$, $Q_{target}$ and $T_{target}$. Bits per sample (a sample is also referred to as a pixel in video encoding) are used for the rate; Peak Signal-to-Noise Ratio (PSNR), Mean of Square Error (MSE), and Sum of Square Error (SSE) for image quality; and nanoseconds per sample for computational complexity measurements. Performance budget allocation is based on the pre-computed mean absolute deviation (MAD) computed by the HEVC reference standard.

Bit allocation requires that encoding bits are assigned for each CTU. The bit allocation strategy is not simple average bit allocation for all CTUs. Instead, bit allocation is based on pre-computed MAD that also takes into account uncontrolled, internal factors of the HEVC that are associated with live video streaming.
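The MAD-weighted bit allocation described here can be sketched in a few lines of Python. The sketch follows Equations (3.1) to (3.6) as they are written in the text. The function names are hypothetical, and it assumes that $D_{remaining}$ in Equation (3.6) equals the MAD sum $D_{left}$ that still includes the current CTU.

```python
def frame_bit_budget(bits_per_second: float, fps: float,
                     n_pixels: int, header_bits: float = 25.0) -> float:
    """Bits available to one frame, per Equations (3.1) and (3.2)."""
    bpp_target = (bits_per_second / fps - header_bits) / n_pixels  # Eq. (3.1)
    return n_pixels * bpp_target                                   # Eq. (3.2)


def ctu_bit_target(r_target: float, r_coded: float,
                   d_i: float, d_left: float, d_total: float) -> float:
    """Bits assigned to the current CTU, per Equations (3.3) to (3.6).

    r_target: frame budget, r_coded: bits already spent in this frame,
    d_i: MAD of the current CTU, d_left: MAD sum of the CTUs not yet encoded
    (assumed to include the current one), d_total: MAD sum of the frame.
    """
    r_left = r_target - r_coded                            # Eq. (3.3)
    r_adj = r_coded - (1.0 - d_left / d_total) * r_target  # Eq. (3.5)
    r_allocated = r_left - r_adj                           # Eq. (3.4)
    return d_i / d_left * r_allocated                      # Eq. (3.6)
```

When earlier CTUs spend exactly their MAD-proportional share, `r_adj` is zero and each CTU simply receives `d_i / d_total` of the frame budget. Any overspend shows up in both `r_left` and `r_adj`, so the remaining CTUs are tightened accordingly.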
The required number of bits per pixel $bpp_{target}$ is estimated using:

$$bpp_{target} = \frac{R_{target}/f - HeaderBits}{N_{pixels}}$$    Equation (3.1)

where $R_{target}$ denotes the target number of bits per second for each video frame, f denotes the number of frames per second, $N_{pixels}$ denotes the number of pixels in each frame, and HeaderBits = 25 are used for storing the header for HEVC intra-frame encoding. Each frame gets $R_{target}$ bits using:

$$R_{target} = N_{pixels} \cdot bpp_{target}$$    Equation (3.2)

Using $R_{coded}$, the total number of bits already used in the current frame, the number of bits remaining is estimated for the rest of the image using:

$$R_{left} = R_{target} - R_{coded}$$    Equation (3.3)

where $R_{left}$ denotes the number of bits allocated in the budget that are still available. With $R_{adj}$ referring to the budget correction that needs to be made based on mean absolute deviation (MAD), $R_{adj}$ is used as given by

$$R_{allocated} = R_{left} - R_{adj}$$    Equation (3.4)

to modify the number of bits that have been allocated for the entire frame. The budget is adjusted using:

$$R_{adj} = R_{coded} - \left(1 - \frac{D_{left}}{D_{total}}\right) \cdot R_{target}$$    Equation (3.5)

where $D_{left}$ refers to the pre-computed MAD sum for the remaining CTUs, and $D_{total}$ refers to the total MAD allocated for the current frame. The goal of Equation 3.5 is to weight bit allocation to be proportional to the remaining MAD that needs to be accounted for. After encoding each CTU using Equation 3.5, $D_{left}$ gets reduced. $D_{left}$ should converge to zero. Thus, effectively, the use of Equation 3.5 is meant to ensure that the remaining CTUs get a number of bits that is proportional to their contribution towards the reduction of $D_{total}$ to zero. After updating $R_{allocated}$ by substituting Equation 3.5 into Equation 3.4, the number of bits is allocated for the current, i-th CTU using:

$$R_{target,i} = \frac{D_i}{D_{remaining}} \cdot R_{allocated}$$    Equation (3.6)

where $D_i$ refers to the MAD reduction associated with the i-th CTU, and $D_{remaining}$ refers to the MAD still left to do for the entire frame.

Similar to bit allocation, the computational complexity budget for each CTU is based on the pre-computed MAD. The encoding time per pixel $time\_per\_pixel_{target}$ is computed using:

$$time\_per\_pixel_{target} = \frac{Time_{target}}{N_{pixels}}$$    Equation (4.1)

where $Time_{target}$ denotes the number of seconds allocated per frame. The total amount of time allocated to the entire frame $T_{target}$ is given by:

$$T_{target} = N_{pixels} \cdot time\_per\_pixel_{target}$$    Equation (4.2)

The amount of time left for encoding the remaining CTUs $T_{left}$ is given by:

$$T_{left} = T_{target} - T_{coded}$$    Equation (4.3)

where $T_{coded}$ refers to the total amount of time already used. The allocated time for each CTU is adjusted using $T_{adj}$ given by:

$$T_{adj} = T_{coded} - \left(1 - \frac{D_{left}}{D_{total}}\right) \cdot T_{target}$$    Equation (4.4)

based on the remaining MAD to cover, as done for the rate. The allocated time for the entire frame is similarly updated using:

$$T_{allocated} = T_{left} - T_{adj}$$    Equation (4.5)

Finally, the amount of time allocated for the CTU is given by its share of the remaining MAD:

$$T_{target,i} = \frac{D_i}{D_{remaining}} \cdot T_{allocated}$$    Equation (4.6)

Image quality is measured using the PSNR. At the CTU level, it is more efficient to work with the sum of squared error (SSE). Thus, there is a need to convert back and forth between PSNR and SSE budget requirements. As for rate and computational complexity, allocation is based on the MAD. PSNR requirements are converted into SSE requirements using:

$$Q_{target} = SSE_{target} = \frac{2^{2 \cdot bitDepth} \cdot N_{pixels}}{10^{PSNR/10}}$$    Equation (5.1)

where $SSE_{target}$ refers to the allocated SSE for the entire frame, and bitDepth refers to the number of bits used to represent each pixel. After encoding a CTU, the remaining SSE budget is similarly given by:

$$Q_{left} = Q_{target} - Q_{coded}$$    Equation (5.2)

Adjustments are similarly made using:

$$Q_{adj} = Q_{coded} - \left(1 - \frac{D_{left}}{D_{total}}\right) \cdot Q_{target}$$    Equation (5.3)

and

$$Q_{allocated} = Q_{left} - Q_{adj}$$    Equation (5.4)

Also, the CTU SSE is given by:

$$SSE_{target,i} = \frac{D_i}{D_{remaining}} \cdot SSE_{allocated}$$    Equation (5.5)

Significant content variation can lead to mis-prediction of the required budgets for each frame. In such cases, no action is taken if the variations stay within the budgets. However, when mis-prediction results in budget deficits, the remaining budget needs to be reallocated to avoid significant artifacts in the reconstructed video. Thus, after the budget is used up, the remaining budget needs to be adjusted to minimize the budget violation. Budget violations are reduced by reducing the estimates of the remaining budget using:

$$B_{adj} = \alpha \cdot (D_{i,left}/D_i) \cdot B_{target}$$    Equation (6.1)
$$T_{adj} = \alpha \cdot (D_{i,left}/D_i) \cdot T_{target}$$    Equation (6.2)
$$SSE_{adj} = \alpha \cdot (D_{i,left}/D_i) \cdot SSE_{target}$$    Equation (6.3)

where $\alpha$ was set to 0.15 after experimenting with different videos. Clearly, $\alpha = 0$ would lead to significant artifacts while $\alpha = 1$ would not attempt to minimize budget violations and would thus allow significant changes in video content to violate the constraints.

The rate-quality-complexity model is spatially adapted to the input video content. A linear model is built based on the encoding of three neighboring CTUs as depicted in FIG. 4. Let i = 1, 2, 3 denote the neighboring CTUs, with each CTU encoded using the pair $(QP_i, Config_i)$ to result in $(SSE_i, T_i, R_i)$. To estimate the linear model, the parameter matrix A is defined using:

$$A = \begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{bmatrix}$$    Equation (7.1)

Then the basic linear model is described by:

$$\begin{bmatrix} SSE_i \\ T_i \\ R_i \end{bmatrix} = \begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{bmatrix} \begin{bmatrix} QP_i \\ Config_i \\ 1 \end{bmatrix}$$    Equation (7.2)
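One way to obtain the parameter matrix A of Equations (7.1) and (7.2) from three neighboring CTU encodings is to solve one 3x3 linear system per objective. The following is a minimal sketch in plain Python, assuming the three (QP, Config) pairs are linearly independent. The helper names `det3`, `solve3`, `fit_linear_model`, and `predict` are hypothetical, and Cramer's rule is used only because the system is tiny.

```python
from typing import List, Sequence, Tuple

Row = Tuple[float, float, float]


def det3(m: Sequence[Sequence[float]]) -> float:
    """Determinant of a 3x3 matrix."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(m: Sequence[Sequence[float]], y: Sequence[float]) -> Row:
    """Solve the 3x3 system m @ x = y with Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("neighbor encodings are not linearly independent")
    cols = []
    for col in range(3):
        # Replace one column of m by y and take the determinant ratio.
        mc = [[y[r] if k == col else m[r][k] for k in range(3)] for r in range(3)]
        cols.append(det3(mc) / d)
    return (cols[0], cols[1], cols[2])


def fit_linear_model(params: Sequence[Tuple[float, float]],
                     sse: Sequence[float], time: Sequence[float],
                     rate: Sequence[float]) -> List[Row]:
    """Return the rows (a, b, c) of A for SSE, T and R, in that order."""
    design = [[qp, cfg, 1.0] for qp, cfg in params]
    return [solve3(design, obs) for obs in (sse, time, rate)]


def predict(model: Sequence[Row], qp: float, cfg: float) -> Row:
    """Evaluate Equation (7.2) for one (QP, Config) pair."""
    sse, t, r = (a * qp + b * cfg + c for a, b, c in model)
    return (sse, t, r)
```

The `ValueError` branch corresponds to neighbors that do not provide three independent encodings, the case that the text handles separately.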
Suppose that the 3 CTU encodings use 3 different pairs of $(QP_i, Config_i)$. In this case, it is expected that the 3 rows of $[QP_i, Config_i, 1]$ should also be linearly independent since the ranges of QP and Config are quite different. Thus, when working with three different CTU encodings, the parameters can be estimated using:

$$\begin{bmatrix} a_1 \\ b_1 \\ c_1 \end{bmatrix} = \begin{bmatrix} QP_1 & Config_1 & 1 \\ QP_2 & Config_2 & 1 \\ QP_3 & Config_3 & 1 \end{bmatrix}^{-1} \begin{bmatrix} SSE_1 \\ SSE_2 \\ SSE_3 \end{bmatrix}$$    Equation (7.3)

$$\begin{bmatrix} a_2 \\ b_2 \\ c_2 \end{bmatrix} = \begin{bmatrix} QP_1 & Config_1 & 1 \\ QP_2 & Config_2 & 1 \\ QP_3 & Config_3 & 1 \end{bmatrix}^{-1} \begin{bmatrix} T_1 \\ T_2 \\ T_3 \end{bmatrix}$$    Equation (7.4)

$$\begin{bmatrix} a_3 \\ b_3 \\ c_3 \end{bmatrix} = \begin{bmatrix} QP_1 & Config_1 & 1 \\ QP_2 & Config_2 & 1 \\ QP_3 & Config_3 & 1 \end{bmatrix}^{-1} \begin{bmatrix} R_1 \\ R_2 \\ R_3 \end{bmatrix}$$    Equation (7.5)

For robust model update, the case is also considered when the neighboring CTUs do not use 3 independent encodings. In this case, $[a_i \; b_i \; c_i]$ is selected as associated with the best predictions. To implement this approach, for the i-th CTU, the prediction errors are computed using:

$$SSE_{error,i} = |SSE_i - a_1 \cdot QP_i - b_1 \cdot Config_i - c_1|$$
$$T_{error,i} = |T_i - a_2 \cdot QP_i - b_2 \cdot Config_i - c_2|$$
$$R_{error,i} = |R_i - a_3 \cdot QP_i - b_3 \cdot Config_i - c_3|$$

The model is then built by using the coefficients associated with the minimum prediction errors. For example, for $A_1$, $[a_{1,i} \; b_{1,i} \; c_{1,i}]$, the following is solved:

$$\min_i \; SSE_{error,i}$$    Equation (7.6)

and $A_{1,j}$ is used to associate with the j-th CTU model that minimizes Equation 7.6 (see also FIG. 4).

Another problem occurs in coming up with an initial model for the first row and first column in each frame. For this case, virtual CTUs are created above the first row and to the left of the first column as shown in FIG. 6. The virtual CTU encodings assume the Pareto front that is initialized from other videos and then updated based on the encodings of the first few frames of the current video. More specifically, for each virtual CTU, the Pareto front is computed based on the average of the current encodings. According to one embodiment, an initial model trained on other videos may be used. After a few frames, the Pareto front is computed from the current video. Here, it is noted that the Pareto front is obtained through an exhaustive evaluation of all possible Config and QP values. However, the cost of estimating the Pareto front is restricted to CTUs over a few frames and offline computations using other videos.

Updated linear models are used to estimate values for QP and Config that can satisfy the constraints and minimize bitrate, maximize quality, or minimize computational complexity. In addition, the invention provides a robust approach for minimizing constraint violations.

The minimum bitrate mode is used to demonstrate the basic concepts. All other models are similar. As explained above, the constraints are used to determine target values for Q, T, R as needed. For the minimum bitrate mode, it is desired to match the constraints on quality $Q_{target}$ and time $T_{target}$. The linear model is used to determine the encoding parameters:

$$\begin{bmatrix} Q_{target} \\ T_{target} \end{bmatrix} = \begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \end{bmatrix} \begin{bmatrix} QP \\ Config \\ 1 \end{bmatrix}$$    Equation (8.1)

Using Equation 8.1, the initial values of the encoding parameters are estimated using:

$$\begin{bmatrix} QP_{est} \\ Config_{est} \end{bmatrix} = \begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \end{bmatrix}^{-1} \begin{bmatrix} Q_{target} - c_1 \\ T_{target} - c_2 \end{bmatrix}$$    Equation (8.2)

$QP_{est}$ and $Config_{est}$ are rounded to the nearest integer values and the model is used as given by:

$$Q = a_1 \cdot QP + b_1 \cdot Config + c_1$$
$$T = a_2 \cdot QP + b_2 \cdot Config + c_2$$
$$R = a_3 \cdot QP + b_3 \cdot Config + c_3$$    Equation (8.3)

to perform a local search with $QP \in [QP_{est} - 2, QP_{est} + 2]$ and $Config \in [Config_{est} - 2, Config_{est} + 2]$ for the minimum bitrate that also satisfies the constraints. Alternatively, if no parameters can satisfy the constraints, the normalized constraint violation is computed using:

$$norm(X) = \frac{X - X_{min}}{X_{mean}}$$    Equation (8.4)

Then, a (QP, Config) pair is selected that minimizes the total normalized constraint violation as given in FIG. 8 for the minimum bitrate mode.

Similarly, for the maximum quality mode, the target budget values are first used for bitrate and performance to determine initial estimates and select optimal encoding parameters based on local search or minimum constraint violation. Then, for the minimum computational complexity mode, the target bitrate and quality are used for the initial search.

While the linear model is simple and robust, it can fail to produce valid values for QP and Config. This failure occurs because the linear model does not impose any restrictions on the constraints. Thus, the constraints end up being significantly above or below the rate-performance-quality surface. When the constraints are significantly off, they are automatically modified to bring them close to the control surface. For valid encodings, it is required that $QP \in [0, 51]$ and $Config \in [0, 13]$. When either parameter falls out of range, the constraints are modified to produce valid encodings.

In general, rate, quality, and computational complexity are non-linearly related. The linear model according to the invention is excellent for local approximations to the non-linear relationship.

The relationship between any pair of constraints is provided using:

$$T = a_1 \cdot SSE^{b_1}, \quad a_1 > 0, \; b_1 < 0.$$
$$SSE = a_2 \cdot R^{b_2}, \quad a_2 > 0, \; b_2 < 0.$$
$$T = a_3 \cdot R^{b_3}, \quad a_3 > 0, \; b_3 > 0.$$    Equation (8.5)

Following are explanations of how to modify the constraints for the minimum rate algorithm. As for the linear model, the neighboring CTU encodings are used to adaptively estimate the relationships between the constraints as shown in FIG. 7.

The main algorithm for estimating $T = a \cdot SSE^{b}$ is given in FIG. 9. Based on the relationship, either the quality or the computational complexity constraint is moved to lie on the curve as given in FIG. 10. Similarly, for the minimum computational complexity mode, $SSE = a \cdot R^{b}$ is estimated as given in FIG. 11 and the constraints updated as given in FIG. 12. The model update and algorithm for the maximum quality (minimum distortion) model is given in FIG. 13 and FIG. 14.

To account for the case of failing to estimate the model, for example, if the left and top CTUs are encoded in the same way, the configuration from the last CTU is used. Similarly, if the constraint update is excessive, the configuration from the last CTU may also be used.

The updated constraints are used for estimating new, valid values for QP and Config. Large changes are prevented by requiring that the QP remain within 24 of the average of the neighboring CTUs. Furthermore, the final encoding parameters are forced to stay within the valid ranges.

One embodiment of the invention is applied to a dynamic reconfiguration example referred to above as the standard RaceHorsesC to demonstrate the advantages of the invention. Specifically, the goal of the following example is to demonstrate the ability to switch from a low profile mode to a medium and then back to a high profile mode. The low, medium, and high profiles are defined by fixing QP to QP = 37, 32 and 27, respectively. Furthermore, for comparing to the proposed approach, for controlling both the bitrate and PSNR, the full range depth configuration (config = 13) is used and the resulting PSNR constraints reduced a little bit to generate the low, medium, and high profiles.

The results are compared for the fixed QP configuration shown in FIG. 15 with the minimum computational complexity mode according to the invention shown in FIG. 16. For constraint satisfaction, mild violations may be allowed in the order of 10% of the constraints. As shown in FIG. 16, it can be seen that DRASTIC control achieves constraint satisfaction at the high rates of 93% for the low, 83% for the medium, and 93% for the high profile. Furthermore, compared to the fixed QP results, the invention achieves savings of 13% for the low, 49% for the medium, and 40% for the high profile. The invention proves not only to meet the given constraints, but to do so while also minimizing the encoding time.

FIG. 17 illustrates a flow chart of an offline process 200 of video encoding and forward model creation for inter-coding according to the invention. With the objective to determine a suitable model to be used to determine the most relevant encoding configuration parameters that affect video quality, bitrate, and frame rate, videos at step 202 are encoded at step 204.

To determine a suitable model for each of the afore-described encoded video characteristics, a linear regression model is employed at step 206 to identify and select the most important encoding parameters (profile, encoding structure, GOP structure, QP, max intra period) to construct the relevant forward model. Stepwise regression is used to both select important parameters as well as reduce the dimensionality of the encoding parameter vectors to determine at step 208 the following optimal models:

$$\log(SSIM) = a_0 \cdot QP + b_0$$
$$\log(Bitrate) = a_1 \cdot QP + b_1$$
$$\log(FPS) = a_2 \cdot QP + b_2$$    Equation (9.1)

FIG. 18 illustrates pseudo code of the offline process of video encoding and forward model creation for inter-coding according to the invention.

FIG. 19 illustrates a flow chart of a real-time adaptation 300 using time-varying constraints for inter-coding according to the invention. For each of the three forward models shown in Equation 9.1, an inverse process is applied at step 308 to predict the optimal quantization parameter values that meet the input constraints. According to one embodiment, Newton's algorithm may be used to find a solution to the forward model that describes the most dominant constraint. Depending on the employed mode of operation (minimum computational complexity mode, the maximum quality mode, and the minimum bitrate mode), mild violations may be allowed, for example, either in the order of -10% for maximum quality mode and frame rate models or in the order of +0.5% for the minimum bitrate models.
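The inverse process for the forward models of Equation (9.1) can be illustrated with a short Newton iteration. This is a sketch under stated assumptions and not the patented procedure. The function names are hypothetical, each model is assumed to be monotonic in QP over the valid range, and the rounding direction (ceiling for lower bounds, floor for upper bounds) is one conservative choice among several. The solver accepts polynomial coefficients so that the same code also covers a model with an added quadratic term in QP, as appears in the claims.

```python
import math
from typing import Iterable, Optional, Sequence, Tuple


def slope(coeffs: Sequence[float], qp: float) -> float:
    """Derivative of sum(coeffs[k] * QP**k) with respect to QP."""
    return sum(k * c * qp ** (k - 1) for k, c in enumerate(coeffs) if k > 0)


def solve_qp_newton(coeffs: Sequence[float], target: float, qp0: float = 30.0,
                    tol: float = 1e-9, max_iter: int = 50) -> float:
    """Solve sum(coeffs[k] * QP**k) = log(target) for QP by Newton's method.

    coeffs are ordered by increasing power, so log(y) = a*QP + b is (b, a).
    """
    log_t = math.log(target)
    qp = qp0
    for _ in range(max_iter):
        f = sum(c * qp ** k for k, c in enumerate(coeffs)) - log_t
        df = slope(coeffs, qp)
        if df == 0.0:
            raise ZeroDivisionError("zero slope: Newton step undefined")
        step = f / df
        qp -= step
        if abs(step) < tol:
            return qp
    raise RuntimeError("Newton iteration did not converge")


def feasible_qp(models: Iterable[Tuple[Sequence[float], float, str]],
                qp_range: Tuple[int, int] = (0, 51)) -> Optional[Tuple[int, int]]:
    """Intersect the QP intervals implied by each constraint.

    Each entry is (coeffs, target, kind), with kind "min" for y >= target
    and "max" for y <= target. Returns (lo, hi) or None if infeasible.
    """
    lo, hi = qp_range
    for coeffs, target, kind in models:
        qp = solve_qp_newton(coeffs, target)
        increasing = slope(coeffs, qp) > 0
        if (kind == "min") == increasing:
            lo = max(lo, math.ceil(qp))   # constraint bounds QP from below
        else:
            hi = min(hi, math.floor(qp))  # constraint bounds QP from above
    return (lo, hi) if lo <= hi else None
```

For a log-linear model the iteration converges after a single step, since the closed-form inverse is QP = (log(target) - b) / a. Newton's method only does real work when the model is nonlinear in QP.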
When more requiring that the QP to remain within 24 of the average of 65 than one solution in terms of QP is generated, the results are the neighboring CTUs . Furthermore, the final encoding rounded up to the nearest integer QP value since the output parameters are forced to stay within the valid ranges. is a continuous numerical value as shown by FIG . 19. By 15 US 11,076,153 B2 adopting this inverse process , some QP predictions may be found outside from the range of QP used in the encoding parameters such that additional configurations may be ran in order to complete the missing values of SSIM , bitrate and frame rate for the missing predicted QPs . FIG . 20 illustrates 5 pseudo code of the real- time adaptation using time -varying constraints for inter -coding according to the invention . While the disclosure is susceptible to various modifications and alternative forms, specific exemplary embodi- 16 encoding each video segment using different video encod ing parameters, Coding Tree Unit configurations, and GOP configurations; evaluating the video quality, required bitrate , and video encoding rate in frames per second for each video segment; and learning the forward regression models that map the video encoding parameters , Coding Tree Unit configurations, and GOP configurations to the video quality, required ments of the invention have been shown by way of example 10 bitrate, and video encoding rate over a training set of in the drawings and have been described in detail . It should video segments . be understood , however, that there is no intent to limit the 4. 
The method of claim 1 , wherein the inverse models use disclosure to the particular embodiments disclosed , but on Newton's algorithm to determine final candidate encoding the contrary, the intention is to cover all modifications, configurations from the forward regression models and equivalents, and alternatives falling within the scope of the 15 constraints on video quality, maximum bitrate, and mini disclosure as defined by the appended claims . mum video encoding rate . 5. The method of claim 1 , wherein the optimal encoding The invention claimed is : configuration is one selected from the group : a maximum 1. A method for real -time adaptive encoding digital video video encoding performance mode , a minimum bitrate signals comprising: 20 mode, and a maximum video quality mode . ( a ) receiving an input video comprising a plurality of 6. The method of claim 5 , wherein the maximum perfor video segments ; mance mode is defined according to : ( b ) applying, to a video segment, real -time input con straints on : ( 1 ) video quality remaining above a mini mum value Q , (2 ) bandwidth with bitrate remaining 25 mint CEC subject to : (Q2 Qmin ) & (R = Rmax ) below a maximum value representing available bitrate , and (3 ) encoding frame rate with a number of frames per second ( FPS) remaining above a minimum encodwith C representing a set of video encoding configura ing rate value , to select initial candidate encoding tions , R representing a number of bits per pixel , T configurations, wherein applying further comprises 30 representing encoding time per frame, and Q represent using pre - computed forward regression models, ing a measure of video quality . wherein the pre -computed forward regression models can vary based on an encoding eme, and are given 7. 
The method of claim 5 , wherein the minimum bitrate mode is defined according to : by : 35 log( Q ) = 2o+ bo QP + co QP , minR subject to : (QzQmin ) & ( T < Tmax ) CEC log ( Bitrate ) = 2, + b , QP + C1: QP , log (FPS ) = az + b2QP + c2QP2, Equation ( 9.1 ) 40 with C representing a set of video encoding configura tions , R representing a number of bits per pixel , T representing encoding time per frame, and Q represent ing a measure of video quality. 8. The method of claim 5 , wherein the maximum quality mode is defined according to : wherein QP is a quantization parameter and ag , bo , Co , aj , b1 , C1 , a2 , b2 , C2 represent regression coefficients determined using a training process that uses video segments similar to 45 the video segments of the plurality; (c ) using the pre -computed forward regression models to derive inverse models to determine final candidate minQ subject to : ( T < Tmax ) & ( R = Rmax ) CEC encoding configurations from the initial candidate encoding configurations; ( d ) selecting an optimal encoding configuration from the 50 with C representing a set of video encoding configura final candidate encoding configurations, wherein the tions , R representing a number of bits per pixel , T optimal encoding configuration satisfies constraints and representing encoding time per frame, and represent achieves a maximum video quality, a minimum band ing a measure of video quality . width , or a maximum frame rate , wherein the optimal 9. The method of claim 1 , wherein the forward regression encoding configuration comprises of a Group of Pic- 55 model is defined in terms of a quantization parameter ( QP) , tures ( GOP ) configuration and a Coding Tree Unit the GOP configuration, and the Coding Tree Unit configu ( CTU ) configuration ; ration . (e ) encoding the video segment using the optimal encod 10. 
The method of claim 1 , wherein video constraints and ing configuration, and the optimization modes are applied individually in a CTU or ( f) repeating ( b ) - ( e ) for all video segments of the plurality 60 a GOP while staying within a budget. of video segments. 11. The method of claim 10 , wherein the budget com 2. The method of claim 1 further comprising creating prises a target bitrate (Rtarget) of a number of bits per second off - line the pre - computed forward regression models . for each video frame according to the equation: 3. The method of claim 2 , wherein creating further Rtarget= Npixels bbPtarget 65 comprises: inputting a plurality of videos that is composed of video wherein Npixets is a number of pixels in each frame and segments; bbptarget is a required number of bits per pixel . US 11,076,153 B2 17 12. The method of claim 10 , wherein the budget com- prises a target frame rate ( Ttarget) of a total amount of time allocated to an entire frame according to the equation : Ttarget= Npixelstime_per_pixelfarget wherein N pixels , is a number of pixels in each frame and time_per_pixel,arget is an encoding time per pixel. 13. The method of claim 10 , wherein the budget comerror ( SSE ) for an entire frame according to the equation : 14. The method of claim 1 , wherein ( a ) - ( f) are applied to the video delivery system can support live and on demand different video segments in a video delivery system , wherein 5 prises a target video quality (Quaget) of a sum of squared 10 · Npixels Qtarget 22-bitDepth 10PSNR /10 wherein bitDepth is a number of bits used to represent each pixel , N pixels , is a number of pixels in each frame and PSNR is Peak Signal -to - Noise Ratio . 18 settings. 15. The method of claim 14 , wherein the video delivery system includes adaptive HTTP streaming (e.g. , MPEG DASH protocol) and RTP protocol based systems. 16. The method of claim 1 , wherein the encoding scheme comprises a GOP configuration. 17. 
The method of claim 1 , wherein the encoding scheme comprises encoding parameters that do not include QP. 18. The method of claim 1 , wherein the video quality is 15 one selected from the group : structural similarity index measure ( SSIM) , peak signal - to -noise ratio ( PSNR) , and video multimethod assessment fusion (VMAF ).
