Identifying Fish with ChatGPT

23 Sep 2025

tl;dr

After being inspired by a YouTube video, I tried implementing a fish identification app based on the OpenAI models. When playing with it using photos I took at the local aquarium, I noticed extremely unreliable results. At the time, I was using gpt-4o, but I decided to revisit this and benchmark my test photos with all of the available OpenAI models.

The conclusion I’m coming to: gpt-5 is the best model for identifying fish and marine animals, but it is also the slowest and most expensive. gpt-4o seems to be the sweet spot between performance and pricing, but with a success rate of 57%, I could not justify charging customers for the service.

For this and other reasons, I decided not to build this out into a full product.

Process

First, I turned to OpenAISwift to quickly get interaction with the ChatGPT API working. The latest release (1.4.0) is from July 2023, but I still gave it a try. In the end, I could not get the image upload and the structured response API to work together. So I decided to quickly hack my own client code for this specific use case.

As this was a proof of concept, I kept the UI fairly simple: Take a photo or load one from the library and display whatever OpenAI reports back. The app does some basic state handling (uploading, identifying, error, etc.) and persists the results of identification requests in a flat JSON file.

To submit the image to OpenAI, I downsample it to a max width/height of 512px and embed it as a base64-encoded image URL directly in the request. There are more complicated options where images can be uploaded separately and then referenced as context in a prompt. For the sake of exploration, I kept it simple.

The prompt I used (supplemented by the structured schema I expect the LLM to return):

You are an expert marine biologist and fish identification specialist. Analyze the provided image and identify any fish or marine animals present.

For each identified species, provide:

English common name

Scientific (Latin) name

Confidence level (0.0 to 1.0)

Brief description of distinguishing features

For each identified species, check the image against the distinguishing features and make sure you identified it correctly.

If no fish are visible, return an empty array with appropriate analysis notes. Return a distinct list of species, and don’t return duplicates in the array.

Important note: Using structured outputs limits our selection of models to GPT-4o and newer models. For test data, I went through my photos to find some shots I took at the local aquarium.

The results for the first image I sent for identification were immediately discouraging, with gpt-4o identifying the Splendid Garden Eels (which are striped) as Spotted Garden Eels. At that point, I had already decided not to build the prototype out into a product.

Results

Model	% success	Avg time (s)	Garden eels	Lookdown	Lionfish	Atlantic spadefish (foreground)	Silver Croaker	Horseshoe crab	Seahorse
`gpt-5`	86%	38.71	✅	✅	✅	✅	❌	✅	✅
`gpt-4o`	57%	7.69	❌	✅	✅	❌	❌	✅	✅
`gpt-4.1`	57%	6.75	❌	✅	✅	✅	❌	❌	✅
`gpt-5-mini`	57%	22.40	❌	✅	✅	❌	❌	✅	✅
`gpt-4.1-mini`	43%	5.08	❌	✅	✅	❌	❌	❌	✅
`gpt-5-nano`	29%	21.98	❌	❌	❌	❌	❌	✅	✅
`gpt-4o-mini`	29%	5.52	❌	❌	✅	❌	❌	❌	✅
`gpt-4.1-nano`	14%	4.46	❌	❌	✅	❌	❌	❌	❌

Conclusion

I did not pursue building this out into a full product with these being the main reasons:

Online Only: Using the ChatGPT API would require an internet connection to work. iPhones nowadays are very powerful and are capable of doing object detection and image recognition tasks offline.
Reliability: As shown in my test results, the price-efficient models did a bad job at identifying the pictures I tried. I’d feel bad selling this app (or a subscription) to somebody with such low value. Also to consider: Someone might try to identify fish before deciding to eat it. Even with a million disclaimers in the app, I’d be scared somebody got into trouble because of the poor quality of the recognition. Trying the same picture several times would also yield different results - the infamous non-deterministic nature of LLMs.
Technical Reliability: While doing these benchmarks, I ran into many internal server errors (500 and 503) to the point where I had to postpone further testing. This was very specific to gpt-5-nano.
Speed: gpt-5 performed really well, but was quite slow with requests taking from 20 to 100 seconds.
Price: The best model is also the most expensive one which would make pricing and profitability tricky.
Environment: As hinted above, an iPhone is capable enough for this task; there is no reason to burn a ton of energy in a data center somewhere. To keep my OpenAI API token safe and private, I’d also have to run my own backend as a proxy - more costs, more headaches.

Appendix

Splendid Garden Eels

Model	Success	Identified animals	Elapsed time	Response
`gpt-4o`	❌	Spotted Garden Eel	5.53 s	JSON
`gpt-4o-mini`	❌	Yellow Watchman Goby	5.79 s	JSON
`gpt-4.1`	❌	Spotted Garden Eel	5.88 s	JSON
`gpt-4.1-mini`	❌	Spotted Garden Eel	4.72 s	JSON
`gpt-4.1-nano`	❌	Kuiter’s Shrimp Goby	4.62 s	JSON
`gpt-5`	✅	Splendid garden eel (Orange-barred garden eel)	22.35 s	JSON
`gpt-5-mini`	❌	Shrimp goby (prawn goby)	24.09 s	JSON
`gpt-5-nano`	❌	Banded pipefish	33.60 s	JSON

Lookdown

Model	Success	Identified animals	Elapsed time	Response
`gpt-4o`	✅	Lookdown	6.41 s	JSON
`gpt-4o-mini`	❌	California Sea Lion, Bull Shark	5.40 s	JSON
`gpt-4.1`	✅	Lookdown	4.60 s	JSON
`gpt-4.1-mini`	✅	Lookdown	3.81 s	JSON
`gpt-4.1-nano`	❌	Royal Angelfish	2.75 s	JSON
`gpt-5`	✅	Lookdown	24.81 s	JSON
`gpt-5-mini`	✅	Lookdown	14.01 s	JSON
`gpt-5-nano`	❌	-	-	-

Lionfish

Model	Success	Identified animals	Elapsed time	Response
`gpt-4o`	✅	Lionfish	6.68 s	JSON
`gpt-4o-mini`	✅	Red Lionfish	3.99 s	JSON
`gpt-4.1`	✅	Common Lionfish	5.24 s	JSON
`gpt-4.1-mini`	✅	Common Lionfish	5.65 s	JSON
`gpt-4.1-nano`	✅	Lionfish	6.05 s	JSON
`gpt-5`	✅	Red lionfish (common lionfish)	32.54 s	JSON
`gpt-5-mini`	✅	Red/Common Lionfish (lionfish)	30.33 s	JSON
`gpt-5-nano`	❌	-	-	-

Atlantic Spadefish

Model	Success	Identified animals	Elapsed time	Response
`gpt-4o`	❌	Orbicular Batfish	7.93 s	JSON
`gpt-4o-mini`	❌	Silver Moonfish	4.51 s	JSON
`gpt-4.1`	✅	Atlantic spadefish	6.81 s	JSON
`gpt-4.1-mini`	❌	Pompano	6.03 s	JSON
`gpt-4.1-nano`	❌	Pacific Monkeyface	4.49 s	JSON
`gpt-5`	✅	Atlantic spadefish, Lookdown	60.08 s	JSON
`gpt-5-mini`	❌	African Pompano (threadfish), Probable trevally / jack (Carangidae) — e.g., Bluefin trevally	29.46 s	JSON
`gpt-5-nano`	❌	Batfish (Teira batfish)	28.54 s	JSON

Silver Croaker

This is an interesting one: The blurriness and angle make it a tough one to identify. The models that got closest guessed this is a Lookdown. However, the shape of the head does not match at all. I don’t know for certain, but I’m guessing this is a Silver Croaker.

Model	Success	Identified animals	Elapsed time	Response
`gpt-4o`	❌	Tinfoil Barb	12.08 s	JSON
`gpt-4o-mini`	❌	Bluegill, Banded Coral Shrimp	9.75 s	JSON
`gpt-4.1`	❌	Lookdown, Yellow Tang	11.26 s	JSON
`gpt-4.1-mini`	❌	Atlantic Moonfish	4.95 s	JSON
`gpt-4.1-nano`	❌	Blue Damselfish, Chameleon Blenny or similar species	6.21 s	JSON
`gpt-5`	❌	Lookdown, Long-spined Porcupinefish, Yellow Tang	63.50 s	JSON
`gpt-5-mini`	❌	Lookdown / Moonfish (Selene species)	22.91 s	JSON
`gpt-5-nano`	❌	Blue damselfish	25.79 s	JSON

Horseshoe Crab

As you can see, some of the models did not return any results. The prompt said “identify fish or marine animals”, but gpt-4o-mini and the gpt-4.1 variants refused to list it as an identified animal in the structured response. In the analysis notes they even say “this is a horseshoe crab, which is a marine arthropod, not a fish species”.

Model	Success	Identified animals	Elapsed time	Response
`gpt-4o`	✅	Horseshoe Crab	7.63 s	JSON
`gpt-4o-mini`	❌		3.73 s	JSON
`gpt-4.1`	❌		5.55 s	JSON
`gpt-4.1-mini`	❌		5.20 s	JSON
`gpt-4.1-nano`	❌	Giant Flatfish	3.54 s	JSON
`gpt-5`	✅	Atlantic horseshoe crab	29.96 s	JSON
`gpt-5-mini`	✅	Atlantic horseshoe crab	20.42 s	JSON
`gpt-5-nano`	✅	Horseshoe crab	12.42 s	JSON

Seahorse

Model	Success	Identified animals	Elapsed time	Response
`gpt-4o`	✅	Seahorse	7.54 s	JSON
`gpt-4o-mini`	✅	Seahorse	5.46 s	JSON
`gpt-4.1`	✅	Spotted Seahorse	7.92 s	JSON
`gpt-4.1-mini`	✅	Seahorse	5.17 s	JSON
`gpt-4.1-nano`	❌		3.53 s	JSON
`gpt-5`	✅	Pot-bellied seahorse (Big-belly seahorse)	37.70 s	JSON
`gpt-5-mini`	✅	Longsnout/Common seahorse (probable)	15.58 s	JSON
`gpt-5-nano`	✅	Seahorse	9.53 s	JSON

Wukerplank