AWS Chime Video Reconstruction

April 8, 2024
AWS Chime is a popular video conferencing framework from Amazon. Similar to Google Meet, Zoom, or Teams, it allows a group of people to connect and share video and audio, along with screens, for a group discussion. Unlike those others, it also has a very robust API that is freely available to anyone using AWS. That makes it a popular choice for application developers, along with Vonage or Twilio.
AWS Chime supports recording the session via MediaPipelines, but unfortunately the resulting “recording” gets dumped into a directory with a few bits:
  • audio - The audio portions of the conversation, fully mixed.
  • video - The video portions of the meeting, with no audio.
  • meeting-events - A collection of JSON files containing discrete events of the meeting, including starting, stopping, change of active speaker, and many more.
  • data-channel - A collection of JSON files of all the data events in the meeting, which can include lots of things depending on how the data channel is used.
This can make reconstructing the meeting a bit tricky. You have to mix the audio and video yourself, and the two may not align. Amazon has an example GitHub repo showing a way to do it automatically at the conclusion of a meeting via some Lambda code here:
aws-samples (updated Apr 6, 2024)
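To make the manual mixing step concrete, here is a minimal sketch of combining one video-only file with one audio file using ffmpeg. It is not taken from either repo mentioned in this post. The file names and the offset value are made up for illustration, and it assumes ffmpeg is installed and that you have already worked out how far the audio needs to shift.

```python
def build_mux_command(video, audio, output, audio_offset_s=0.0):
    """Return an ffmpeg argument list that combines a video-only file with an audio file.

    audio_offset_s shifts the audio timestamps: a positive value delays the
    audio relative to the video, a negative value advances it.
    """
    return [
        "ffmpeg", "-y",
        "-i", str(video),
        # -itsoffset applies to the input that follows it (the audio file here).
        "-itsoffset", f"{audio_offset_s:.3f}",
        "-i", str(audio),
        "-map", "0:v:0",   # video stream from the first input
        "-map", "1:a:0",   # audio stream from the second input
        "-c:v", "copy",    # keep the video as-is
        "-c:a", "aac",     # re-encode the audio
        "-shortest",       # stop at the end of the shorter stream
        str(output),
    ]


# Hypothetical file names; real recordings are split into many chunks.
cmd = build_mux_command("video/chunk.mp4", "audio/chunk.mp4", "output.mp4", 0.25)
print(" ".join(cmd))
```

The function only builds the command. Passing the list to subprocess.run(cmd, check=True) would actually run it.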
But I wanted something a bit simpler and more local, so that I could reconstruct specific meetings on my local workstation. And so I wrote a simple Python script called “chime_reconstruct” and published it here:
Yeraze (updated Apr 1, 2024)
Per the instructions in the readme, basically:
  • Use aws s3 cp --recursive to copy the entire video recording locally
  • Drop this Python script in the resulting directory
  • Run it, and get an output.mp4
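Before running the script, it can help to confirm that the download contains what you expect. The following is a small sketch, separate from chime_reconstruct, that counts the files under each of the four subdirectories described earlier. It assumes the local copy uses exactly those directory names.

```python
from pathlib import Path

# Subdirectory names as described in the list earlier in this post.
EXPECTED_SUBDIRS = ("audio", "video", "meeting-events", "data-channel")


def inventory(recording_dir):
    """Count the files under each expected subdirectory; None marks a missing one."""
    root = Path(recording_dir)
    counts = {}
    for name in EXPECTED_SUBDIRS:
        sub = root / name
        if sub.is_dir():
            counts[name] = sum(1 for p in sub.rglob("*") if p.is_file())
        else:
            counts[name] = None
    return counts


if __name__ == "__main__":
    # Run from inside the downloaded recording directory.
    for name, count in inventory(".").items():
        print(f"{name}: {'missing' if count is None else count}")
```

A missing or empty video directory is worth catching here, since it changes what you should expect in the final output.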
It reconstructs the video and audio. If there are missing or corrupt video chunks, you will get black frames in those sections. In addition, it places subtitles for any special events that happen during the meeting (except for change of speaker).
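As a rough illustration of the subtitle idea, the sketch below turns a list of event times and labels into SRT text. It is not the code from chime_reconstruct. It assumes the events have already been reduced to (offset in seconds, label) pairs, and the labels and the three-second display time are invented for the example.

```python
def srt_timestamp(seconds):
    """Format a non-negative offset in seconds as HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def events_to_srt(events, display_s=3.0):
    """Render (offset_seconds, text) pairs as SRT cues, each shown for display_s seconds."""
    blocks = []
    # Sort by offset so the cues are numbered in playback order.
    for index, (offset, text) in enumerate(sorted(events), start=1):
        start = srt_timestamp(offset)
        end = srt_timestamp(offset + display_s)
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    return "\n".join(blocks)


# Hypothetical events; real labels would come from the meeting-events JSON files.
sample = [(65.5, "Attendee joined"), (2.0, "Recording started")]
print(events_to_srt(sample))
```

Events that fall closer together than the display time will produce overlapping cues, which most players tolerate but which you may prefer to merge.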
So far it’s working in my limited use. I’m sure there are gaps and special cases it won’t handle that I haven’t run into yet, so feel free to revise it to your own needs. I’ve attempted to include comments and helpful guidelines in the code to make it a bit easier (and ChatGPT friendly 🙂).