This is an idea for a video compression algorithm. Every type of video compression algorithm is best for a specific kind of video source so this algorithm might be great for some movies and really bad for some others.
I actually wrote an encoder for this to test it out (I think I did it in Java…) and it worked pretty well.
I will try not to use exotic jargon but clearly explain the concept of this algorithm as much as possible without going into the basics.
A movie is basically a set of images.
One approach to compress a movie is to compress the individual images-for example to reduce the number of pixels stored per image, remove some color information etc. Another approach is to compress the movie as a movie for example to remove some of the images or frames. There is obviously also a combination of the two approaches.
When compressing each image one can remove some of the data or convert it to some kind of a formula. Removing data for example is to not save the data for some pixels and when the movie is rendered this pixel is generated by averaging or simulating it in another way from the data that is saved.
Removing data due to formulas can be done with tokenizing for example where a group of repeating pixels is saved as a common “texture” and whenever this “texture” is needed the code just needs to point to it and not replicate the entire pixel information to draw it (zip files use some form of token lists and token pointers for example). Another formula example is some form of vectorizing or saving the mathematical formula that can be used to draw the texture that its data it replaces. The problem with vectorizing is that in many cases the exact formula will be longer than the data it is replacing so a simpler formula is used that only closely matches the pattern and not exactly.
Another example for formula based compression is the simple RLE (run length encoding) where in the simplest for a set of similar pixels that are close to each other their pattern can be only written once with the number of repetitions. For example a instead of writing twenty black pixels as bbbbbbbbbbbbbbbbbbbb they can be written as 20b and use less space without loosing any information.
Let’s assume that there are sections in an image that are exactly the same. In theory you could create a section in the
beginning of the image file and save the needed pixel information for this section plus an identifier for this section. In
every place where we want to instruct the rendering software to draw this pixel pattern we need to specify the location coordinates and the pattern ID. The problem is that this is not very useful for small patterns as the space needed to specify the location of the pattern and the pattern number is the same or less space than would take to simply write the pixel data.
When we encode larger patterns we find less repetitions. This results in many patterns and overall less number of efficient patterns that repeat. When we set an aggressive tolerance level and use large patterns but allow them to be used even if they are only somewhat close to the needed pattern we get much better compression but we can see square artifacts on the image and over all bad image quality.
The same technique is also used for patterns in a movie that can relate to more than one image or frame. Even if the same image section does not repeat in the image it might repeat across other frames in the movie. If for example the background stays constant for a few seconds there are potentially hundreds of individual images with many repeating patterns. The problems are finding large enough patterns that are worth all the extra data that is added to save the patterns and the locations and IDs which can add up to a lot and might actually increase the size of the file if kept in a good quality (means that the saved patters are the same or close to the needed pattern in the source file).
So finally we get to my idea. My idea involves enjoying the savings in replicated pixels across frames without wasting space on patterns and ID when tokenizing repeated pixel data across frames and within frames. The idea also allows for more aggressive loss compression that will blend very well in the movie and will not create noticeable artifacts.
The idea is to separate the movie data into many separate streams. One for each pixel in the movie across all frames. For example in the pixel array pixel[1][1] (assuming we start our array at 1 and not 0) will represent the top left pixel and the data will include all the color information for this pixel for the entire movie.
So, this pixel might have repeating values for any frames that this pixel does not change in. In my encoder I did not
analyze the stream for complex patters but I did do the following:
- Have a unencoded stream with the complete pixel data for analyzing compression ratios.
- Have an RLE encoded stream to show basic RLE which gave good results in many movies.
- Added RLE tolerance in which if pixels are close enough according to some set threshold they will be encoded as the
same or follow some predefined pattern of becoming softer or stronger across a run of frames. For example if in regular RLE 20 20 20 20 20 20 20 20 can be saved as 8×20 than with a threshold of 2 also 19 18 20 19 19 18 20 20 can be saved as 8×19 instead to keeping the exact values. Also 12 13 14 14 14 15 16 16 16 18 19 can be saved as 11,12-19 to create an increasing pattern between 12 and 19 over 11 pixels. This is not exact replication but might be close enough. A closer matching patter will require more detailed definitions for the formula. The movie can be analyzed with genetic algorithms for curve fitting to find the most efficient formulas but it would take a long time to encode).
- Added optional channel RLE for RGB specific channel changes that might be effective for specific movies.
- Added optional (random?) skip of runs of pixels where the pixels should be simulated from surrounding pixels. As each pixel is encoded individually the chances that all 9 pixels around are also simulated are very small unless the compression set is very aggressive. Results are very good and it can generate very small sizes.
Some disadvantages for this movie codec are the memory and processing time that might be needed for high resolution videos. Worked surprisingly well and was easy to create the encoder. I did not make the decoder and I can help anyone who might want to try.