As the hunt continued, we learned a great deal about how audio files work, especially their containers and streams. Thanks go to our mentor, Dr. Raja Kushalnagar, and another co-worker on the project team for giving us the information we needed to sample audio files successfully. When I found a sample folder of various audio files, I couldn’t believe that it was 34 gigabytes in size! Yes, thirty-four gigabytes of audio, all in one folder! I then realized we should sort our audio files into three categories: Short, Medium, and Long.
The Short category consists of audio files less than 1 minute in length; Medium: 1 minute to 5 minutes; Long: longer than 5 minutes. In the 34 GB sample folder, it wasn’t difficult to find audio files under a minute long. I had much greater difficulty finding medium and long audio files in open source samples online. Thankfully, one of the co-workers on the team said he might be able to get internal audio files from previous projects at Gallaudet University. He asked for permission, and it was granted. These audio files were long, exactly what we needed for our project. Each one was set up as a single file with two microphone inputs, so we still needed a way to split it into mono channels. We learned how to use Audacity to do the split, and now the audio samples are perfect for our development testing purposes.
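For anyone who wants to script these two steps instead of doing them by hand, here is a minimal Python sketch. It is an illustration only, not the workflow used in the project (the split was done in Audacity). It assumes the files are uncompressed PCM WAV files, which the standard-library `wave` module can read; the function names and the choice to count exactly 1 minute and exactly 5 minutes as Medium are assumptions made for the example.

```python
import wave

# Category boundaries from the text: Short < 1 min, Medium 1-5 min, Long > 5 min.
SHORT_MAX_SECONDS = 60
MEDIUM_MAX_SECONDS = 5 * 60


def categorize(duration_seconds: float) -> str:
    """Return 'short', 'medium', or 'long' for a duration in seconds."""
    if duration_seconds < SHORT_MAX_SECONDS:
        return "short"
    if duration_seconds <= MEDIUM_MAX_SECONDS:
        return "medium"
    return "long"


def wav_duration_seconds(path: str) -> float:
    """Duration of a PCM WAV file: number of frames divided by the frame rate."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def split_stereo_frames(frames: bytes, sample_width: int) -> tuple[bytes, bytes]:
    """De-interleave raw stereo PCM frames into (left, right) mono byte strings."""
    frame_width = 2 * sample_width  # one left sample followed by one right sample
    left = bytearray()
    right = bytearray()
    for start in range(0, len(frames) - frame_width + 1, frame_width):
        left += frames[start:start + sample_width]
        right += frames[start + sample_width:start + frame_width]
    return bytes(left), bytes(right)


def split_stereo_wav(src_path: str, left_path: str, right_path: str) -> None:
    """Write each channel of a two-channel WAV file to its own mono WAV file."""
    with wave.open(src_path, "rb") as src:
        if src.getnchannels() != 2:
            raise ValueError("expected a two-channel WAV file")
        sample_width = src.getsampwidth()
        frame_rate = src.getframerate()
        frames = src.readframes(src.getnframes())

    left, right = split_stereo_frames(frames, sample_width)
    for path, data in ((left_path, left), (right_path, right)):
        with wave.open(path, "wb") as out:
            out.setnchannels(1)
            out.setsampwidth(sample_width)
            out.setframerate(frame_rate)
            out.writeframes(data)
```

A call such as `categorize(wav_duration_seconds("sample.wav"))` would label a file, where `sample.wav` is a placeholder name. Compressed formats like MP3 would need a different reader.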
easyRTC is basically WebRTC with an easier implementation and setup, so we didn’t have to code everything from scratch to get working video and audio. This is especially important because we only have 10 weeks in our internship. We cannot waste time building anything from scratch; instead, we use open source software and modify it to meet our needs. With easyRTC set up, we started looking into modifying the code to do what we want for the project. Unfortunately, I had tough luck getting some things, like the CSS and static files, to work properly in a Node.js environment. I only have a basic knowledge of how Node.js works, so I had no luck getting things to work the way we wanted. We shifted our focus to the WebRTC platform and worked from there, since some static files could be loaded through it. I spent 2 days up against this wall trying to get Node.js + Express.js + CSS to all work together, and it didn’t work out. Thankfully, we are still able to do everything successfully with the WebRTC platform.
Streams + transcripts
Our next focus was getting streams up and running. The problem we had to solve was that the stream wouldn’t necessarily be live: to run a well-controlled experiment, all ASRs need to use the same video/audio files so we can see the results of each ASR engine and gather data. Emelia, my co-intern, spent most of her time getting the virtual webcam stream up and running, while I focused on getting a transcript page up and running to show the text being transcribed by the ASR engine. With both of those working after a couple of days, we started to study the ASRs and their Node.js counterparts to see how we could get them to work seamlessly with the WebRTC platform. After studying and understanding some of the implementations, Emelia worked on getting some Node.js scripts to work properly in a CLI environment with the sample audio file, while I continued to work on the transcript webpage and modify some things. Towards the end of the week, we shifted our focus to IBMC and GCP, since Microsoft Azure (AZ) did what we needed in a console environment out of the box. IBMC and GCP didn’t, but we are pretty sure they have the features in their APIs. We just need to figure out how to get them to work similarly to the AZ console environment. We are going to resume this work next week with the GCP and IBMC environments.