Official PyTorch implementation for LSSVC: A Learned Spatially Scalable Video Coding Scheme
- Python 3.6
- CUDA (if you want to use a GPU for acceleration)
- cudatoolkit=11.3
- pytorch==1.10.1
- torchvision==0.11.2
- pytorch-msssim==0.2.1
- bd-metric==0.9.0
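A quick sanity check of the environment can look like the minimal sketch below; the import names follow the packages listed above, and the expected versions are simply those from the list:

```python
import torch
import torchvision
import pytorch_msssim  # noqa: F401  (only checking that it is importable)

print("torch       :", torch.__version__)        # expected 1.10.1
print("torchvision :", torchvision.__version__)  # expected 0.11.2
print("CUDA build  :", torch.version.cuda)       # expected 11.3
print("GPU usable  :", torch.cuda.is_available())
```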
We only provide a test script that performs real bitstream writing. Please build the C++ code first, since it is required for actual bitstream writing.
sudo apt-get install cmake g++
cd src
mkdir build
cd build
conda activate $YOUR_PY36_ENV_NAME
cmake ../cpp -DCMAKE_BUILD_TYPE=Release
make -j
- RD curve with intra period 12
- RD curve with intra period 32
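The BD-rate numbers behind RD curves like the ones above follow the standard Bjøntegaard-delta calculation (a cubic fit of log-rate over the overlapping quality range). Below is a self-contained sketch of that calculation; it is not the interface of the bd-metric package listed in the requirements:

```python
import numpy as np

def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Bjøntegaard-delta rate (%) of a test codec against an anchor codec.

    Each argument is a list of four (or more) RD points; a negative result
    means the test codec needs fewer bits at the same quality.
    """
    # Fit log(rate) as a cubic polynomial of PSNR for both curves.
    p_anchor = np.polyfit(psnr_anchor, np.log(rate_anchor), 3)
    p_test = np.polyfit(psnr_test, np.log(rate_test), 3)
    # Integrate over the PSNR range shared by both curves.
    lo = max(min(psnr_anchor), min(psnr_test))
    hi = min(max(psnr_anchor), max(psnr_test))
    int_anchor = np.polyval(np.polyint(p_anchor), hi) - np.polyval(np.polyint(p_anchor), lo)
    int_test = np.polyval(np.polyint(p_test), hi) - np.polyval(np.polyint(p_test), lo)
    avg_log_diff = (int_test - int_anchor) / (hi - lo)
    return (np.exp(avg_log_diff) - 1.0) * 100.0
```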
We provide our pretrained models:
- Scalable Intra models (IntraSS)
- Scalable Inter models (LSSVC)
IntraSS is a simplified variant of LSSVC that codes frames without inter-frame references.
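Once downloaded, a checkpoint can be inspected with plain PyTorch before it is plugged into the test script. The sketch below uses a placeholder file name; the actual names ship with the released models:

```python
import torch

# Placeholder path: substitute the file name of the downloaded checkpoint.
ckpt = torch.load("checkpoints/lssvc_model.pth", map_location="cpu")

# Released checkpoints are typically a state_dict, or a dict wrapping one.
state_dict = ckpt.get("state_dict", ckpt) if isinstance(ckpt, dict) else ckpt
num_params = sum(v.numel() for v in state_dict.values() if torch.is_tensor(v))
print(f"{len(state_dict)} entries, {num_params / 1e6:.1f} M parameters")
```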
We provide the encoder and decoder command lines for HM-18.0, VTM-21.2, and SHM-12.4. VTM-21.2 is used for both simulcast coding and two-layer scalable coding in our experiments.
Simulcast encoding and decoding consist of two independent single-layer encoding and decoding passes, so we summarize the single-layer command lines below.
- HM-18.0 Encoder
TAppEncoder
-c encoder_lowdelay_main.cfg
--IntraPeriod={intra period}
--FramesToBeEncoded=96
--InputFile={input file name}
--FrameRate={frame rate}
--SourceWidth={width}
--SourceHeight={height}
--InputBitDepth=8
--InputChromaFormat=420
--DecodingRefreshType=2
--ConformanceWindowMode=1
--QP={qp}
--BitstreamFile={bitstream file name}
- HM-18.0 Decoder
TAppDecoder
-b {bitstream file name}
-o {reconstructed file name}
- VTM-21.2 Encoder (single-layer)
EncoderApp
-c encoder_lowdelay_vtm.cfg
--IntraPeriod={intra period}
--FramesToBeEncoded=96
--InputFile={input file name}
--FrameRate={frame rate}
--SourceWidth={width}
--SourceHeight={height}
--InputBitDepth=8
--InputChromaFormat=420
--DecodingRefreshType=2
--OutputBitDepth=8
--QP={qp}
--BitstreamFile={bitstream file name}
- VTM-21.2 Decoder (single-layer)
DecoderApp
-b {bitstream file name}
-o {reconstructed file name}
- SHM-12.4 Encoder
TAppEncoder
-c encoder_lowdelay_scalable.cfg
-c layers.cfg
--IntraPeriod0={intra period of layer0}
--InputFile0={input file name of layer0}
--FrameRate0={frame rate of layer0}
--SourceWidth0={width of layer0}
--SourceHeight0={height of layer0}
--QP0={qp of layer0}
--IntraPeriod1={intra period of layer1}
--InputFile1={input file name of layer1}
--FrameRate1={frame rate of layer1}
--SourceWidth1={width of layer1}
--SourceHeight1={height of layer1}
--QP1={qp of layer1}
--FramesToBeEncoded=96
--InputBitDepth=8
--InputChromaFormat=420
--DecodingRefreshType=2
--BitstreamFile={bitstream file name}
- SHM-12.4 Decoder
TAppDecoder
-b {bitstream file name}
-o {reconstructed file name}
- VTM-21.2 Encoder (two-layer scalable)
EncoderApp
-c encoder_lowdelay_vtm.cfg
-c two-layers.cfg
-l0 --FrameRate={frame rate of layer0}
-l0 --FramesToBeEncoded=96
-l0 --IntraPeriod={intra period of layer0}
-l0 --SourceWidth={width of layer0}
-l0 --SourceHeight={height of layer0}
-l0 --OutputBitDepth=8
-l0 --InputFile={input file name}
-l0 --QP={qp of layer0}
-l1 --FrameRate={frame rate of layer1}
-l1 --FramesToBeEncoded=96
-l1 --IntraPeriod={intra period of layer1}
-l1 --SourceWidth={width of layer1}
-l1 --SourceHeight={height of layer1}
-l1 --OutputBitDepth=8
-l1 --InputFile={input file name}
-l1 --QP={qp of layer1}
--BitstreamFile={bitstream file name}
- VTM-21.2 Decoder (two-layer scalable)
DecoderApp
-b {bitstream file name}
-o {reconstructed file name}
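The decoders above write planar YUV reconstructions, and the RD points are then derived from the reconstruction distortion and the bitstream size. Below is a standalone sketch of both measurements, assuming 8-bit YUV420 input and the 96-frame setting used in the command lines; for simulcast coding, the sizes of the two single-layer bitstreams are summed:

```python
import os
import numpy as np

def yuv420_luma_psnr(ref_path, rec_path, width, height, frames=96):
    """Average luma PSNR between two planar 8-bit YUV420 files."""
    frame_bytes = width * height * 3 // 2
    psnrs = []
    with open(ref_path, "rb") as ref, open(rec_path, "rb") as rec:
        for _ in range(frames):
            # Keep only the Y plane of each frame.
            y_ref = np.frombuffer(ref.read(frame_bytes), np.uint8)[: width * height]
            y_rec = np.frombuffer(rec.read(frame_bytes), np.uint8)[: width * height]
            mse = np.mean((y_ref.astype(np.float64) - y_rec.astype(np.float64)) ** 2)
            psnrs.append(10.0 * np.log10(255.0 ** 2 / max(mse, 1e-12)))
    return float(np.mean(psnrs))

def bitstream_kbps(path, frame_rate, frames=96):
    """Average bitrate in kbps of one bitstream file."""
    return os.path.getsize(path) * 8.0 * frame_rate / frames / 1000.0
```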
If you find our work useful for your research, please cite:
@ARTICLE{10521480,
author={Bian, Yifan and Sheng, Xihua and Li, Li and Liu, Dong},
journal={IEEE Transactions on Image Processing},
title={LSSVC: A Learned Spatially Scalable Video Coding Scheme},
year={2024},
volume={33},
number={},
pages={3314-3327},
keywords={Video coding;Encoding;Image coding;Standards;Scalability;Spatial resolution;Static VAr compensators;Learned video coding;spatial scalability;scalable video coding;contextual MV encoder-decoder;hybrid temporal-layer context mining;interlayer prior},
doi={10.1109/TIP.2024.3395025}}