Splitting Video With Ffmpeg and Python
I had a project to build a simple website that splits an uploaded video into parts of equal
duration (except the last one, if the division leaves a remainder). Almost everyone on the
internet suggests ffmpeg (https://www.ffmpeg.org/), which is so far considered the best
open-source Swiss Army knife for video manipulation.
After hours of browsing and trial and error, I found the two best ways to do this with
ffmpeg. Each has its own advantages and disadvantages. I used Python to glue the whole
process together because I felt it was the best choice for this kind of task.
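The first solution, suggested by a helpful Stack Overflow answer
(https://stackoverflow.com/a/28884437/1862500), uses the following command as the base,
the same one the Python code below builds:
ffmpeg -ss <start> -t <duration> -i <input> <output>
For example, with input.mp4 and output.mp4 as placeholder file names, this command
extracts the video from the 10th second to the 15th second (5 seconds duration):
ffmpeg -ss 10 -t 5 -i input.mp4 output.mp4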
Most references on the internet suggest adding the -c copy option, which skips re-encoding
and makes the process much faster. The trade-off is that the duration of the extracted
video will not always be exactly what we asked for (it can be off by 1 to 5 seconds). The
method above (without -c copy), on the other hand, is very slow because of the re-encoding,
but the resulting duration is very precise.
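To make the trade-off concrete, here is a minimal sketch of a helper that builds the argument list for either variant. The name build_cut_command and its copy flag are hypothetical, invented for this illustration; only the ffmpeg options themselves come from the commands discussed above.

```python
def build_cut_command(src, dst, start, duration, copy=False):
    """Build the ffmpeg argument list that extracts `duration` seconds from `start`.

    copy=False: ffmpeg re-encodes (slow, precise duration).
    copy=True:  streams are copied as-is (fast, duration may be off).
    """
    cmd = ["ffmpeg", "-hide_banner", "-ss", str(start), "-t", str(duration), "-i", src]
    if copy:
        # skip re-encoding
        cmd += ["-c", "copy"]
    # -y overwrites the output file if it already exists
    cmd += ["-y", dst]
    return cmd


# The 10th-to-15th-second example, in both variants (placeholder file names).
# The commands are only printed here, not executed.
print(" ".join(build_cut_command("input.mp4", "output.mp4", 10, 5)))
print(" ".join(build_cut_command("input.mp4", "output.mp4", 10, 5, copy=True)))
```

The returned list can be passed straight to subprocess.check_call, which avoids quoting problems with file names that contain spaces.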
import re
import math
import shlex
from subprocess import check_call, PIPE, Popen

# Matches the "Duration: HH:MM:SS.xx" line of ffmpeg's report and the frame
# rate on the line after it. NOTE: the original line was cut off after the
# frame-rate group; the trailing " fps" is an assumed completion.
re_metadata = re.compile(r'Duration: (\d{2}):(\d{2}):(\d{2})\.\d+,.*\n.* (\d+(\.\d+)?) fps')


def get_metadata(filename):
    '''
    Get video metadata using ffmpeg
    '''
    # with no output file given, ffmpeg prints the input's metadata to stderr
    p1 = Popen(["ffmpeg", "-hide_banner", "-i", filename], stderr=PIPE, universal_newlines=True)
    output = p1.communicate()[1]
    matches = re_metadata.search(output)
    if not matches:
        raise Exception("Can't parse required metadata")
    # whole seconds only: the fractional part of the duration is discarded
    video_length = int(matches.group(1)) * 3600 + int(matches.group(2)) * 60 + int(matches.group(3))
    video_fps = float(matches.group(4))
    return video_length, video_fps


def split_cut(filename, n, by='size'):
    '''
    Split video by cutting and re-encoding: accurate but very slow
    Adding "-c copy" speeds up the process but causes imprecise chunk durations
    Reference: https://stackoverflow.com/a/28884437/1862500
    '''
    assert n > 0
    assert by in ['size', 'count']
    split_size = n if by == 'size' else None
    split_count = n if by == 'count' else None

    # parse metadata
    video_length, video_fps = get_metadata(filename)

    # derive whichever of split_size / split_count was not given
    if split_size:
        split_count = math.ceil(video_length / split_size)
        if split_count == 1:
            raise Exception("Video length is less than the target split_size.")
    else:
        # rounding down here can leave a short tail of the video uncovered
        split_size = round(video_length / split_count)

    pth, ext = filename.rsplit(".", 1)
    output = []
    for i in range(split_count):
        # chunk i starts right where chunk i-1 ended
        split_start = split_size * i
        output_path = '{}-{}.{}'.format(pth, i + 1, ext)
        cmd = 'ffmpeg -hide_banner -loglevel panic -ss {} -t {} -i "{}" -y "{}"'.format(
            split_start,
            split_size,
            filename,
            output_path
        )
        check_call(shlex.split(cmd), universal_newlines=True)
        output.append(output_path)
    return output
The idea is simply to calculate the exact start time of each chunk and call ffmpeg once per
chunk.
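The start-time arithmetic can be sketched on its own, without ffmpeg. The helper below is a hypothetical illustration (plan_chunks is not part of the project code). Unlike the function above, it also clamps the last chunk to the remaining duration, which ffmpeg would otherwise do implicitly by stopping at the end of the file.

```python
import math


def plan_chunks(video_length, split_size):
    """Return a (start, duration) pair in seconds for every chunk."""
    count = math.ceil(video_length / split_size)
    chunks = []
    for i in range(count):
        start = i * split_size
        # the last chunk only gets whatever is left of the video
        chunks.append((start, min(split_size, video_length - start)))
    return chunks


# A 12-second video cut into 5-second chunks
print(plan_chunks(12, 5))  # [(0, 5), (5, 5), (10, 2)]
```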
The second solution, based on the Medium article referenced in the code below, uses
ffmpeg's segment muxer together with -c copy, which makes it very fast. On top of that, a
single command already splits the video into multiple parts, so we don't need to call it
more than once from Python. The catch is that, just like the first solution with -c copy,
the chunks may not always have exactly the duration we asked for. Most likely this is
because, without re-encoding, a cut can only be made on a key frame. Even though this
method tries to "force" a key frame at each expected cut, the result isn't very precise,
probably because key-frame options apply to the encoder and so have little or no effect on
a copied stream.
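To see why the chunk durations drift, here is a simplified model, not ffmpeg's actual implementation. It assumes that, without re-encoding, each cut lands on the first key frame at or after the requested time. The function name snap_cuts_to_keyframes and the key frame timestamps are invented for illustration.

```python
from bisect import bisect_left


def snap_cuts_to_keyframes(cuts, keyframes):
    """Map each requested cut time to the first key frame at or after it.

    `keyframes` must be sorted in ascending order. Cuts that fall after the
    last key frame are dropped.
    """
    snapped = []
    for cut in cuts:
        i = bisect_left(keyframes, cut)
        if i < len(keyframes):
            snapped.append(keyframes[i])
    return snapped


# Hypothetical key frame timestamps in seconds (made up for this example)
keyframes = [0.0, 4.2, 8.4, 12.6, 16.8]
# Requested cuts every 5 seconds
print(snap_cuts_to_keyframes([5, 10, 15], keyframes))  # [8.4, 12.6, 16.8]
```

Under this model the first chunk ends at 8.4 seconds instead of 5, which is the kind of error described above.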
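def split_segment(filename, n, by='size'):  # function name assumed: the original header was lost
    '''
    Split video using ffmpeg's segment muxer: fast, but chunk durations are imprecise
    Reference https://medium.com/@taylorjdawson/splitting-a-video-with-ffmpeg-the-grea
    '''
    assert n > 0
    assert by in ['size', 'count']
    split_size = n if by == 'size' else None
    split_count = n if by == 'count' else None

    # parse metadata
    video_length, video_fps = get_metadata(filename)

    # derive whichever of split_size / split_count was not given
    if split_size:
        split_count = math.ceil(video_length / split_size)
        if split_count == 1:
            raise Exception("Video length is less than the target split_size.")
    else:
        split_size = round(video_length / split_count)

    pth, ext = filename.rsplit(".", 1)
    # number of frames per chunk, used as the key frame interval
    frame_group = round(split_size * video_fps)
    # NOTE: the original command was cut off after "-segment_time {}". Everything
    # from "-g" onwards is an assumed reconstruction based on the surrounding text
    # and on the output names returned below.
    cmd = ('ffmpeg -hide_banner -loglevel panic -i "{}" -c copy -map 0 -segment_time {} '
           '-g {} -force_key_frames "expr:gte(t,n_forced*{})" -f segment -y "{}-%d.{}"').format(
        filename, split_size, frame_group, split_size, pth, ext)
    check_call(shlex.split(cmd), universal_newlines=True)

    # return list of output (index start from 0)
    return ['{}-{}.{}'.format(pth, i, ext) for i in range(split_count)]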
Since the command handles the splitting internally, we only need to compute the proper
parameters to pass to it. The only extra computation compared to the first method is
frame_group = round(split_size*video_fps), the number of frames in each chunk.
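As a small worked example of that formula, the sketch below computes the value for a few common frame rates. Wrapping it in a function named frame_group is only for illustration, and I am assuming the result is meant to be passed to ffmpeg as the key frame interval (the -g option).

```python
def frame_group(split_size, video_fps):
    """Number of frames in one chunk, rounded to the nearest whole frame."""
    return round(split_size * video_fps)


print(frame_group(5, 30.0))    # 150
print(frame_group(5, 29.97))   # 150 (149.85 rounded)
print(frame_group(10, 23.98))  # 240 (239.8 rounded)
```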
Conclusions
So those are the two solutions I found and used in my last video splitting project. The
first one is precise but slow, while the second one is fast but not precise. I decided to
implement both and let the user choose whichever they need. I hope this post helps anyone
working on a similar project.
I'm still very inexperienced in this video manipulation business, so if you know a better
solution, please let me know in the comments. I'd be very interested to test it.
Cheers!