Want to bulk download Images from Facebook Messenger in high resolution? Read on, or scroll right through to the code. This simple Image Scraper for Messenger is available to all.
The Story
Well, it’s been quite a long time since my last blog post. Life has been busy, with full of ups and downs. But I digress, that’s not how you stumbled upon this post most likely.
So, I’ll just cut right to the chase, and right down to the story.
A few weeks ago, an old friend reached out and asked if I knew how to download images from Facebook Messenger. His girlfriend and himself had recently attended a wedding ceremony. That evening, she had exchanged hundreds of photos with him. A few days later, he encountered this very problem.
At the time of writing this post, Messenger does not make it easy to bulk download high-resolution images shared over each chat.
I was interested. Having not done any Python programming in a while, I figured I’d look into it, once I had some time. Several weeks later, and here we are.
The (brief) Research
Given that I wanted to do this in Python, I needed to find a Python API that allowed me to easily interact with Messenger.
A quick search yielded a few results, but I was quickly seduced by fbchat.
The Implementation
After spending a bit of time re-configuring my Python environment (Python 3.7, pip, PyCharm, venv, etc.) and creating a new repository on GitHub, I got to work.
Thankfully Python pretty much reads itself, so I encourage you to simply read through the following if you want to understand how it works. Building the Image Scraper script was, overall, pretty easy.
GitHub (most up-to-date): https://github.com/cbdelavenne/fb-messenger-media-scraper
import os import requests import time import uuid import configparser import datetime from fbchat import Client, ImageAttachment from fbchat import FBchatException from pathlib import Path politeness_index = 0.5 # ;) epoch = datetime.datetime(1970, 1, 1) def download_file_from_url(url, target_path): """ Download image from a given URL to a specified target path. :param url: URL of file to download :param target_path: Local target path to save the file :type url: str :type target_path: str """ if url is not None: r = requests.get(url) with open(target_path, 'wb') as f: print('\tDownloading image to {path}'.format(path=target_path)) f.write(r.content) def convert_date_to_unix_ms(date, as_int=True): """ Convert a given date string to epoch (int in milliseconds) :param date: Date string (preferred format %Y-%m-%d) :param as_int: Return unix timestamp as an integer value, instead of a float :type date: str :type as_int: int :return: int """ try: dt = datetime.datetime.strptime(date, '%Y-%m-%d') res = ((dt - epoch).total_seconds() * 1000) # convert to milliseconds return int(res) if as_int else res except ValueError: return None if __name__ == '__main__': config_path = Path('.') / 'config.ini' if os.path.exists(config_path) is False: raise Exception("Please create config.ini under this script's current directory") # Load config file config = configparser.ConfigParser() config.read(config_path) download_path = config.get('Download', 'path') if os.path.exists(download_path) is False: raise Exception("The path specified in download_path does not exist ({path}). Please specify a valid path in " "config.ini".format(path=download_path)) # Initialize FB Client fb_email = config.get('Credentials', 'email') fb_pw = config.get('Credentials', 'password') fb_client = Client(fb_email, fb_pw) # Search for latest threads thread_search_limit = int(config.get('Threads', 'search_limit')) thread_search_before = convert_date_to_unix_ms(config.get('Threads', 'before_date')) if thread_search_before is not None: threads = fb_client.fetchThreadList(limit=thread_search_limit, before=thread_search_before) else: threads = fb_client.fetchThreadList(limit=thread_search_limit) # Find correct thread for given user URL my_thread = None for thread in threads: if thread.url == config.get('Friend', 'url'): my_thread = thread # Get Messages for my_thread if my_thread is not None: message_search_limit = int(config.get('Messages', 'search_limit')) message_search_before = convert_date_to_unix_ms(config.get('Messages', 'before_date')) if message_search_before is not None: messages = fb_client.fetchThreadMessages(my_thread.uid, limit=message_search_limit, before=message_search_before) else: messages = fb_client.fetchThreadMessages(my_thread.uid, limit=message_search_limit) # Extract Image attachments' full-sized image signed URLs (along with their original file extension) full_images = [] sender_id = None if config.getboolean('Media', 'sender_only'): sender_id = my_thread.uid for message in messages: if len(message.attachments) > 0: if (sender_id is None) or (sender_id == message.author): for attachment in message.attachments: if isinstance(attachment, ImageAttachment): try: full_images.append({ 'extension': attachment.original_extension, 'full_url': fb_client.fetchImageUrl(attachment.uid) }) except FBchatException: pass # ignore errors # Download Full Images if len(full_images) > 0: images_count = len(full_images) print('Attempting to download {count} images...................\n'.format(count=images_count)) for full_image in full_images: friend_name = str.lower(my_thread.name).replace(' ', '_') file_uid = str(uuid.uuid4()) file_ext = full_image['extension'] img_url = full_image['full_url'] image_path = ''.join([download_path, '\\', 'fb-image-', friend_name, '-', file_uid, '.', file_ext]) download_file_from_url(img_url, image_path) # Sleep half a second between file downloads to avoid getting flagged as a bot time.sleep(politeness_index) else: print('No images to download in the last {count} messages'.format(count=message_search_limit)) else: print('Thread not found for URL provided')
Sample output:
Logging in {EMAIL}... Login of {EMAIL} successful. Attempting to download 2 images................... Downloading image to c:\\Users\\{USER}\\Downloads\fb-image-{FRIEND_NAME}-50345061-3ff9-4f0a-a6f4-1988a4259a62.png Downloading image to c:\\Users\\{USER}\\Downloads\fb-image-{FRIEND_NAME}-9deb01e8-eae2-4915-a28f-ac61d85bea2e.png
Some Caveats
The current version of the script comes with a few caveats:
- Finding the chat you’re looking for isn’t the easiest. First you need your friend’s profile page URL. Then, the
fbchat
API seems to be restricted to searching through 20 threads at a time. - Finding the messages you’re looking for isn’t the easiest either. I’m also not sure what the query limit is for messages.
- The script could use some extra error handling for users that aren’t necessarily comfortable editing this code themselves.