Date: 2023-12-29
We all have a folder full of images whose filenames resemble line noise. How about renaming those images with the help of a local LLM (large language model) run from the command line? All that and more is on display in [Justine Tunney]’s bash one-liners for LLMs, a showcase aimed at giving folks ideas and guidance on using a local (and private) LLM to do actual, useful work.
This builds on the recent llamafile project, which turns LLMs into single-file executables. This not only makes them more portable and easier to distribute, but the executables can also be called from the command line and write to standard output like any other UNIX tool. It’s also simpler to version control the embedded LLM weights (and therefore their behavior) when it’s all part of the same file.
One such tool (the multi-modal LLaVA) is capable of interpreting image content. As an example, we can point it to a local image of the Jolly Wrencher logo using the following command:
llava-v1.5-7b-q4-main.llamafile --image logo.jpg --temp 0 -e -p '### User: The image has...\n### Assistant:'
Which produces the following response:
The image has a black background with a white skull and crossbones symbol.
With a different prompt (“What do you see?” instead of “The image has…”) the LLM even picks out the wrenches, but one can already see that the right pieces exist to do some useful work.
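The same call can be wrapped so it runs over a whole folder. What follows is a minimal sketch, assuming the llamafile is on your PATH under the name used above. The LLAVA variable and the describe_image / describe_all function names are made up for illustration and are not part of llamafile or [Justine]’s scripts.

```shell
#!/bin/sh
# Path to the model executable; override by setting LLAVA (hypothetical variable).
LLAVA=${LLAVA:-llava-v1.5-7b-q4-main.llamafile}

# Print the model's description of one image, using the flags shown above.
describe_image() {
  "$LLAVA" --image "$1" --temp 0 -e \
    -p '### User: The image has...\n### Assistant:'
}

# Print "file<TAB>description" for every .jpg in a directory.
describe_all() {
  for img in "$1"/*.jpg; do
    [ -e "$img" ] || continue   # skip the unexpanded pattern when nothing matches
    printf '%s\t%s\n' "$img" "$(describe_image "$img")"
  done
}
```

Calling describe_all ~/Pictures would then print one line per image, ready to be piped into whatever comes next.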
Check out [Justine]’s rename-pictures.sh script, which cleverly evaluates image filenames. If an image’s given filename already looks like readable English (also a job for a local LLM) the image is left alone. Otherwise, the picture is fed to an LLM whose output guides the generation of a new short and descriptive English filename in lowercase, with underscores for spaces.
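The naming rule in that last step (lowercase, underscores for spaces) is simple enough to sketch on its own. The function below is an illustration only, not code taken from rename-pictures.sh. The name to_filename_stem is hypothetical, and the sketch assumes plain ASCII text as input.

```shell
#!/bin/sh
# Hypothetical helper: turn free-form text into a lowercase filename stem.
to_filename_stem() {
  printf '%s' "$1" |
    LC_ALL=C tr '[:upper:]' '[:lower:]' |   # lowercase everything
    LC_ALL=C tr -cs 'a-z0-9' '_' |          # any run of other characters becomes one underscore
    sed -e 's/^_//' -e 's/_$//'             # trim a leading or trailing underscore
}
```

For example, to_filename_stem 'A white skull and crossbones.' prints a_white_skull_and_crossbones, to which the original file extension can then be appended.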
What about the fact that LLM output isn’t entirely predictable? That’s easy to deal with. [Justine] suggests always calling these tools with the --temp 0 parameter. Setting the temperature to zero makes the model deterministic, ensuring that the same input always yields the same output.
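That claim is easy to check for yourself. Here is a small sketch, using a hypothetical helper name (same_output_twice), that runs any command twice and compares what it wrote to standard output both times.

```shell
#!/bin/sh
# Hypothetical helper: run the given command twice and compare its stdout.
# Returns 0 if both runs printed the same text, 1 if they differed,
# and 2 if the command itself failed.
same_output_twice() {
  first=$("$@") || return 2
  second=$("$@") || return 2
  [ "$first" = "$second" ]
}

# Usage with the command from the article (assumes the llamafile is on PATH):
#   same_output_twice llava-v1.5-7b-q4-main.llamafile --image logo.jpg --temp 0 -e \
#     -p '### User: The image has...\n### Assistant:' && echo "repeatable"
```

Dropping --temp 0 from the call and running the check again is a quick way to see how much the parameter matters for a given prompt.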
There are more neat examples in the Bash One-Liners for LLMs that demonstrate different ways to use a local LLM that lives in a single-file executable, so be sure to give it a look and see if you get any new ideas. After all, we have previously shown how automating tasks is almost always worth the time invested.
6 thoughts on “Using Local AI On The Command Line To Rename Images (And More)”
How about just thumbnails?
My thumbs don’t take kindly to being renamed.
That is why he wants only nails. Not whole thumbs.
You can process pictures with ML, rename them, and generate thumbnails after that,
or you can write a script capable of searching for the same name and renaming all instances of that name.
Regex plus grep (ripgrep) can do a lot for your searches.
Yes, but put this data into the EXIF annotation inside the picture file itself. It is a widely used standard, and there are a lot of tools to view / sort images based on this data. For example, if you want to view all images with a COW in them, you just open your favorite picture viewing program and it will filter them based on your query. Even console tools can search / sort based on EXIF data.
I am saving the names of windows visible in screenshots in EXIF.
Also, if I generate pictures for some project, then the project / client name is also embedded in EXIF. It makes life so much easier when a client calls wanting to know something about an old project and you need to find something 10 years old… These pictures are not creative graphics; they are photos of devices, schemas, wiring, water damage for an insurance company, etc.
OR
as Apple does with the Photos app: they put all this metadata into an SQLite database after on-device ML processes your pictures, once your Apple device detects that it is connected to a charger.
ref: https://simonwillison.net/2020/May/21/dogsheep-photos/
I also use on-device ML for transcription of company calls; videos can have subtitles / transcriptions that are searchable with console tools too.
One could add an extra step with the LLM to convert the description to a more concise filename.
For example:
User: Generate short, useful filename for image described as: “The image has a black background with a white skull and crossbones symbol.”
Assistant: SkullCrossbones_BlackWhiteBG.jpg
