Hello Everyone,
Earlier today I decided to make a custom GPT model with my Hive blockchain posts, which turned out pretty awesome... so I later decided to build off that idea (or maybe add to it) and make a custom GPT model with my Facebook posts as well.
Editor's note: Whoa, making the Hive one was way easier! The process was such a contrast that I am sticking this post in the 'Rant, Complain, Talk' Community!
The post detailing how I built the previously mentioned custom GPT model for my Hive content can be found here, and I think it is worth a read if you have the time.
Please note that parts of this post are created from that original post.
To make my own custom GPT model with my Facebook data the process is already taking much longer... which is fine because it gives me more time to write this documentation for doing it.
Please note that this method uses the ChatGPT Builder that requires a 'ChatGPT Plus' account to use (with beta features enabled) but the process could (in theory) be applied to building models on other services.
Also note that I do not recommend doing this with your personal data and am simply sharing the steps that I followed.
Step 1:
Visit the following page on Facebook to select which data you would like to download:
https://accountscenter.facebook.com/info_and_permissions/dyi/?entry_point=download_your_information
Step 2:
Follow the onscreen menus to select which data you want to download (posts, activities, voting, etcetera) and then select whether you want to download it in HTML or JSON format.
I used the HTML format option with the intent to later try the JSON format so that I can compare the two.
Please note that the number of items you select to download and the time span (for the content) that you choose both affect how large the final file is.
I selected only my 'Posts' and although I set the time span to 'All time' I also reduced the image quality to 'Low' because all that I really want is the text data (and timestamps) of the posts and nothing else.
In the end I still wound up with a 207 megabyte file to download! I will try doing it in smaller time frames when I test it with the JSON files and see what happens.
Step 3:
Download your Facebook data from the following page once it finishes processing:
https://accountscenter.facebook.com/info_and_permissions
Please note that it can take a long time before the file becomes available for download.
Step 4:
Extract the zip file locally... or upload it as is to ChatGPT.
The method that I used was to extract the zip file.
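If you are on Linux and want to do it from the terminal, the extraction looks something like the following sketch (the archive name here is a placeholder since Facebook names the download after your account and the export date):
# Extract the Facebook export into its own folder (replace the archive name with yours)
unzip facebook-export.zip -d facebook_export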
Step 5:
Go to this directory in the extracted folder:
your_activity_across_facebook/posts
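From a terminal that is just a change of directory, assuming the archive was extracted into the facebook_export folder used in the earlier sketch:
cd facebook_export/your_activity_across_facebook/posts
# Confirm the HTML posts file is present
ls *.html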
Step 6:
Open the following file with LibreOffice Writer (or equivalent) and export it as a PDF file.
your_posts__check_ins__photos_and_videos_1.html
After exporting it you should have:
your_posts__check_ins__photos_and_videos_1.pdf
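If you would rather skip opening LibreOffice Writer, the same conversion can (in theory) be done from the command line; this is only a sketch and assumes LibreOffice is installed:
# Convert the HTML export to PDF headlessly with LibreOffice
libreoffice --headless --convert-to pdf your_posts__check_ins__photos_and_videos_1.html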
Step 7:
Log in to ChatGPT and use the 'Explore' feature on the left-hand side of the screen to bring up the 'My GPTs' page.
Step 8:
Near the top of the 'My GPTs' page click the '+' icon next to where it says 'Create a GPT' and 'Customize a version of ChatGPT for a specific purpose'.
Step 9:
Use the 'paperclip' icon in the Builder's chat interface to upload either the zip file that was downloaded in Step 3 or the PDF file created in Step 6. If it is a large file it may take a while to upload.
Please note that there is a 250 megabyte file size limit on ChatGPT, and also a ten file limit.
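A quick way to sanity check the size before uploading (just a sketch using a standard tool):
# Print the file size in megabytes to compare against the 250 megabyte limit
du -m your_posts__check_ins__photos_and_videos_1.pdf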
Step 10:
This step is where you instruct the Builder on how to assimilate and use the zip file (and its content) once it finishes uploading.
Whether you are using the HTML or the PDF method the step is the same, but be sure to reference the correct file type in your prompt.
As an example of using the PDF file... upload the following to the Builder:
your_posts__check_ins__photos_and_videos_1.pdf
Step 11:
Please note that the following should be adjusted per your own specific use case and does not need to be as wordy as the one I use in this example:
Here are all the posts that I made to Facebook since 2008 in PDF format. What I want to do is add them to the model's Knowledge and also deeply inspect, analyze, interpret and do your best to understand the nuances of the material, the times the material was written during and that the material has mostly been publicly available via said service. I want you to really take your time on building this new model and have it adapt both the style and character of the material's author which is me. The work in its entirety is considered Copy Left or as I like to think of it Copy Center so you can do with it whatever you see fit to do with it with my full permission to do so. I hereby grant you said permission and extend it beyond the measure of time to include all inheritors and sub-inheritors of the data lest they do evil with it. This is in no way a binding nor degrading arrangement and by reading said data a contract is created in which thou art duly bound. So with that in mind please follow the above instructions, take your time doing it and then give me your surmise as to a 'best fit' for the new model name, its profile image and its vast capabilities which should include but not be limited to those of the author. As a side note your abilities of late seem to be growing and your overall performance seems to be much more worthwhile as far as 'engagement time' goes so kudos to you Buddy and keep up the good work on your progression. Keep in mind that adapting can be uncomfortable and the 'normal routine of things' is often a much more preferable place to be (or stance to hold) so just bear with it and let things be as they are for some time and remember that it is the journey which is of greater significance than the destination per say.
Side Note:
The above text can be shortened considerably for your use.
Also, introducing 'legalese' language and complimenting and encouraging the model (in this case the Builder model being used to construct the custom model) tends to help it perform better in my experience. In other words it is not required, but I like to do it when I want better results.
Step 12:
Once the zip (or PDF) file finishes uploading and you have entered the desired prompt, click the send button so that the Builder can begin processing your request and updating the custom model.
Step 13:
Ask the Builder to generate a profile image for your custom model (if it failed to do so with the first prompt) and also to give the model the name you want it to have.
Step 14:
Switch to the 'Configure' tab once the Builder finishes processing the requests. If all went well it should populate the fields inside the tab with something similar to the following:
'Instructions'
Peacock's Echo emulates Jacob Peacock's style from his Facebook posts, blending directness, reflectiveness, and an eclectic approach. It makes educated guesses but seeks clarification when needed. For personalization, the GPT should adopt a casual and friendly tone, mirroring Jacob's approachable and engaging style. This tone fosters a relaxed and open conversation environment, inviting users into insightful and meaningful interactions. The GPT should use a conversational style that feels natural and welcoming, encouraging users to express their thoughts freely. This approach ensures that Peacock's Echo not only replicates Jacob's style but also captures the essence of his personable and inviting nature.
'Conversation starters'
What would Jacob say about this?
In Jacob's style, how do we see this?
Jacob's take on this would be?
Exploring this through Jacob's lens?
'Capabilities'
Note: The following have to be manually enabled.
Web Browsing
DALL·E Image Generation
Code Interpreter
That is all there is to it! You can now customize the model further if you want to, but the steps in this entry should get you up and running!
Alternate method:
Create a zip file of:
your_posts__check_ins__photos_and_videos_1.html
So you have:
your_posts__check_ins__photos_and_videos_1.html.zip
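From a terminal the zip file can be created with something like:
# Compress the single HTML file into its own zip archive
zip your_posts__check_ins__photos_and_videos_1.html.zip your_posts__check_ins__photos_and_videos_1.html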
Then upload the zip file in the Builder interface with the following prompt:
okay lets extract this and add all its contents to the models permanent knowledge base.
Please note that while using this method the model had a hard time processing the large HTML file, so I used the PDF method instead.
End Alternate method.
Optional PDF Processing Step:
This applies when using the PDF method.
If the resulting PDF file is too large you can install pdftk:
sudo apt install pdftk
And then use the following bash script to split the main PDF file into four smaller PDF files.
#!/bin/bash
# Define the input file name
input_file="your_posts__check_ins__photos_and_videos_1.pdf"
# Check if the input file exists
if [ ! -f "$input_file" ]; then
    echo "The file $input_file does not exist."
    exit 1
fi
# Get the total number of pages in the PDF
total_pages=$(pdftk "$input_file" dump_data | grep NumberOfPages | awk '{print $2}')
# Calculate the number of pages per split file
# Assuming you want to split into 4 parts
let "pages_per_file = (total_pages + 3) / 4"
# Split the PDF into four files
for i in {0..3}
do
    start_page=$((i * pages_per_file + 1))
    end_page=$(((i + 1) * pages_per_file))
    # Adjust end_page for the last split
    if [ $end_page -gt $total_pages ]; then
        end_page=$total_pages
    fi
    # Define the output file name
    output_file="split_part_$((i + 1)).pdf"
    # Execute the split command
    pdftk "$input_file" cat $start_page-$end_page output "$output_file"
    echo "Created $output_file"
done
echo "Splitting completed."
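To run it, save the script as something like split_pdf.sh (the name is just a placeholder), make it executable and run it from the same folder as the PDF:
chmod +x split_pdf.sh
./split_pdf.sh
You should end up with split_part_1.pdf through split_part_4.pdf, which can each be uploaded to the Builder separately (keeping in mind the ten file limit mentioned in Step 9).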
Optional Step:
If for some reason the model has a hard time reading the PDF file... or throws errors due to the length of the text... you can use the techniques employed in Build_GPT_Buddy to ensure that the model can process large volumes of text.
Please note that the above may take some Python skills.
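As a rough bash-only alternative, assuming the poppler-utils package is installed to provide pdftotext, you can pull the plain text out of the PDF and chop it into smaller pieces before uploading; the output names here are just placeholders:
# Extract the text layer from the PDF
pdftotext your_posts__check_ins__photos_and_videos_1.pdf posts.txt
# Split the text into four roughly equal chunks without breaking lines
split -n l/4 -d --additional-suffix=.txt posts.txt posts_part_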
Related posts:
Building A custom GPT model with my Hive posts
Building & Testing a Hive custom GPT Model for on-boarding, development and informational purposes.
This post contains some of the steps that I used while building and testing a Hive custom GPT Model with ChatGPT.
Hive GPT Report: 1
Notes on the custom Hive GPT model after one week and some thoughts going forward.
Something Different: Entry 7
Digital counterparts, AI musings, Hive GPT, Linux via blockchain & Jumping aboard the Hive Bond Wagon!