Today’s menu: remaking the YouTube Layout Timelapse video. I felt the original was a bit incomplete, and the way I took the screenshots was very janky, so I dedicated this afternoon to writing a quick script in Python and set it running.
Back when I made the original, taking screenshots manually was out of the question, so it was time to write a script. But what language would I use? Well, I had just started Status (which I have to write about sometime) and was too rebellious to learn Python, so PHP it was! This is a tale of what not to do.
Opening a web browser and taking a screenshot of it using PHP? It’s both possible and unethical! PHP has a built-in COM extension, and COM is a layer on top of DCE RPC. I’m not going to pretend that I know about it in depth, but PHP’s manual says it can be “used to glue applications and components together on the Windows platform”. Basically, I can open Internet Explorer and get it to navigate to places.
    $browser = new COM('InternetExplorer.Application');
    $browser->Navigate('http://www.example.com');
Borderline demonic! Not only that, but I can also take screenshots of windows via PHP!
    $handle = $browser->HWND; // PHP variable names are case-sensitive, so this has to be $browser.
    $img = imagegrabwindow($handle, 0);
Why a language designed for HTML templating would ever need a “take a screenshot of a window” function is beyond me, but it exists. I used these two great features, along with some datetime stuff, to cycle through Wayback Machine snapshots. Knowing how Wayback Machine URLs were formatted, it was easy to throw all of this into a loop and add a week to the timestamp each time.
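The loop itself boils down to formatting a date as a 14-digit Wayback Machine timestamp and stepping it forward a week at a time. Here is a minimal sketch of that idea, written in Python rather than the original PHP. The function name `weekly_wayback_urls` and its parameters are made up for illustration, and the URL pattern is borrowed from the Python script further down.

```python
import datetime


def weekly_wayback_urls(start, end, target="http://www.youtube.com/"):
    """Yield (timestamp, url) pairs, one per week, from start up to and including end."""
    snapshot = start
    while snapshot <= end:
        # Wayback Machine timestamps look like YYYYMMDDhhmmss.
        timestamp = snapshot.strftime("%Y%m%d%H%M%S")
        yield timestamp, "http://web.archive.org/web/" + timestamp + "if_/" + target
        snapshot += datetime.timedelta(days=7)


if __name__ == "__main__":
    first = datetime.datetime(2005, 4, 28)
    last = datetime.datetime(2005, 5, 19)
    for timestamp, url in weekly_wayback_urls(first, last):
        print(timestamp, url)
```

Unlike the real scripts, this one takes an end date, so it stops on its own instead of needing to be killed by hand.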
You’d probably want to wait until the page is done loading before taking a screenshot of it, right? Me too! Unfortunately, I was unable to find a solid way to make the script wait until Internet Explorer was done loading, so sleep() it was! The Internet Archive has always been notoriously slow, and with Internet Explorer struggling more and more to render pages as web standards progressed, I was looking at a good 40 seconds per page.
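For what it’s worth, the usual alternative to a fixed sleep is to poll a “done yet?” check with a timeout. Below is a rough sketch of that pattern in Python, not what the PHP script did. The helper name `wait_until` and its parameters are invented for this example, and `is_ready` stands in for whatever readiness check the browser offers.

```python
import time


def wait_until(is_ready, timeout=60.0, interval=0.5):
    """Poll is_ready() until it returns True or timeout seconds have passed.

    Returns True if the condition was met, False if we gave up.
    """
    deadline = time.monotonic() + timeout
    while True:
        if is_ready():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)  # Short nap between checks instead of one long sleep.
```

Internet Explorer’s COM object does expose a Busy property that a check like this could poll, though whether it would have been reliable on slow Wayback Machine pages is anyone’s guess.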
After two days of giving up my computer and troubleshooting frequently, I finally had a folder of images to compile. I threw them all into Vegas Pro, rendered, and threw it up onto VidLii. Not much else to say.
This time around: Python! There’s a nice package called Selenium (Aclevo reference?) that automates web browsers. All I needed to do was familiarize myself with the docs, write up a quick script, and leave it running. Of course, things are never that simple. The documentation was strangely outdated, as a lot of the example code threw deprecation warnings, but I eventually landed on this:
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options
    import datetime

    options = Options()
    options.add_argument("--headless")  # No window.
    options.add_argument("--kiosk")  # No title bar.
    options.add_argument("-P")  # Using my Firefox profile for plugins;
    options.add_argument("default-release")  # the flag and the profile name go in as separate arguments.
    options.page_load_strategy = 'normal'  # Wait until fully loaded.

    driver = webdriver.Firefox(options=options)
    driver.set_window_size(1280, 720)

    snapshot = datetime.datetime(2005, 4, 28)  # Start date.
    try:
        while True:  # Am lazy, will just stop it manually (Ctrl+C).
            timestamp = snapshot.strftime("%Y%m%d%H%M%S")
            driver.get('http://web.archive.org/web/' + timestamp + 'if_/http://www.youtube.com/')
            driver.get_screenshot_as_file(timestamp + ".png")
            snapshot = snapshot + datetime.timedelta(days=7)
            print(timestamp)
    except KeyboardInterrupt:
        pass
    finally:
        driver.quit()  # Close the browser even when the loop is stopped by hand.
I had to stop the script at around 2017 because that’s when YouTube put up their new layout, which broke on the Wayback Machine. Looking at the metadata, it took about 5 hours to get all the screenshots. Not too shabby! All that was left was to stitch them together, and I used FFmpeg to do it. (Me? FFmpeg? Never!)
ffmpeg -r 15 -f concat -safe 0 -i mylist.txt -c:v libx264 -pix_fmt yuv420p "output.mp4"
Fifteen frames a second. I fed a big list of filenames into FFmpeg and it stitched them together all right. You can read up on the concat demuxer in FFmpeg’s documentation. Quite useful. I had to pass -safe 0, which turns off the concat demuxer’s check for “unsafe” file paths, because FFmpeg was not happy about filenames having slashes or something. Dumb.
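The list file is just one `file '...'` line per image. Here is a small sketch of how it could be generated in Python. The function name `write_concat_list` and its defaults are made up for this example; it assumes the screenshots are PNGs named by timestamp (as in the script above), so sorting by filename also sorts them by date.

```python
from pathlib import Path


def write_concat_list(image_dir, list_path="mylist.txt", pattern="*.png"):
    """Write an FFmpeg concat list naming every matching image, sorted by filename.

    Returns the number of images listed.
    """
    images = sorted(Path(image_dir).glob(pattern))
    lines = []
    for image in images:
        # Forward slashes everywhere; a single quote inside the quoted path becomes '\''.
        escaped = image.as_posix().replace("'", "'\\''")
        lines.append(f"file '{escaped}'\n")
    Path(list_path).write_text("".join(lines), encoding="utf-8")
    return len(images)


if __name__ == "__main__":
    print(write_concat_list("."), "images listed")
```

Keep in mind that the concat demuxer resolves relative paths against the location of the list file, so it is simplest to run this from the folder where mylist.txt will live.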
Here’s the final result, uploaded to VidLii for old times’ sake. Keeping it unlisted because I’m not entirely happy with it. A few duplicate screenshots, Wayback Machine errors, uBlock Origin either leaving the ads completely intact or leaving a giant white space in their place. Oh well.
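The duplicate screenshots, at least, could probably be weeded out before stitching. This is a rough sketch of one way to do it in Python, not something I actually ran. The function name `find_duplicate_frames` is made up, and it only catches files that are byte-for-byte identical, so two captures of the same snapshot that rendered even slightly differently would slip through.

```python
import hashlib
from pathlib import Path


def find_duplicate_frames(image_dir, pattern="*.png"):
    """Return the paths whose contents exactly match an earlier file (in filename order)."""
    seen = set()
    duplicates = []
    for image in sorted(Path(image_dir).glob(pattern)):
        digest = hashlib.sha256(image.read_bytes()).hexdigest()
        if digest in seen:
            duplicates.append(image)  # Same bytes as a frame we already kept.
        else:
            seen.add(digest)
    return duplicates


if __name__ == "__main__":
    for path in find_duplicate_frames("."):
        print("duplicate:", path)
```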
’Twas an exercise in automation.