Vision–Language Model Challenge
Challenge Overview
You are given images representing the final state of a completed Cascadia board game played by three players.
Your task is to determine whether a Vision–Language Model (VLM) system can correctly understand the game state and compute final scores based on the official scoring rules.
Task
Using the provided images as input, answer the following:
- How many tokens of each animal type each player has
- What is the score associated with each animal type
- The final total score for each player
- Who won the game
Bonus (optional):
- Identify and count habitat tiles
- Reason about habitat-based scoring if applicable
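For the bonus task, habitat scoring in Cascadia is, as far as the public rules describe it, driven by each player's largest contiguous area of each habitat type. The sketch below shows one way to compute that size with a flood fill over a hex grid. It makes two loud assumptions that are not part of this challenge: tile positions have already been extracted from the image as axial hex coordinates `(q, r)`, and each tile carries a single habitat label (real tiles can show two terrains, which would need edge-level adjacency). The function name `largest_region` and the habitat labels are hypothetical.

```python
from collections import deque

# The six neighbor offsets of a hex cell in axial (q, r) coordinates.
HEX_DIRS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]


def largest_region(tiles: dict[tuple[int, int], str], habitat: str) -> int:
    """Return the size of the largest connected group of `habitat` tiles.

    Assumption: `tiles` maps axial coordinates to ONE habitat label per tile.
    """
    seen: set[tuple[int, int]] = set()
    best = 0
    for start, kind in tiles.items():
        if kind != habitat or start in seen:
            continue
        # Breadth-first flood fill from an unvisited tile of this habitat.
        queue = deque([start])
        seen.add(start)
        size = 0
        while queue:
            q, r = queue.popleft()
            size += 1
            for dq, dr in HEX_DIRS:
                nxt = (q + dq, r + dr)
                if nxt not in seen and tiles.get(nxt) == habitat:
                    seen.add(nxt)
                    queue.append(nxt)
        best = max(best, size)
    return best


# Illustrative board, not taken from the challenge images.
example_tiles = {
    (0, 0): "forest",
    (1, 0): "forest",
    (1, -1): "forest",
    (3, 0): "forest",  # isolated from the other forest tiles
    (0, 1): "river",
}
print(largest_region(example_tiles, "forest"))
```

Keeping the coordinate extraction separate from the region logic means the VLM's reading of the board can be inspected and corrected by hand before any scoring is applied.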
Inputs
Assume your inputs are:
- Images showing the final board positions of all three players
- Images of the wildlife scoring cards used in the game
These images represent the complete and only source of truth.
Scoring Rules Reference
The following wildlife scoring cards were used to calculate victory points in this game.

You may refer to publicly available resources to understand the rules of Cascadia:
- Written rules: https://whatsericplaying.com/2020/09/14/cascadia/
- Video walkthrough (reference only): Game Reference
Note: External resources are provided only to understand the game rules.
The final board images and scoring card images above are the only inputs to your solution.
Expected Output
Your system should produce:
- Animal counts per player
- Score breakdown by animal type
- Final total score for each player
- Winner determination
Outputs should be clearly explained and verifiable.
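One possible way to keep these outputs verifiable is to hold them in a small structure from which totals and the winner are derived rather than typed in by hand. The sketch below is only an illustration: the class `PlayerResult`, the function `winners`, and all numbers are hypothetical. It assumes the total is the sum of wildlife scores only (habitat scoring is treated as the optional bonus), and it reports ties as a list without applying any official tie-breaker.

```python
from dataclasses import dataclass

# The five wildlife types in Cascadia.
ANIMALS = ("bear", "elk", "salmon", "hawk", "fox")


@dataclass
class PlayerResult:
    name: str
    animal_counts: dict[str, int]  # tokens counted on the board image
    animal_scores: dict[str, int]  # points awarded per animal type

    @property
    def total(self) -> int:
        # Assumption: total = sum of wildlife scores only.
        return sum(self.animal_scores.get(a, 0) for a in ANIMALS)


def winners(results: list[PlayerResult]) -> list[str]:
    """Return the name(s) with the highest total; ties are all returned."""
    best = max(r.total for r in results)
    return [r.name for r in results if r.total == best]


# Placeholder numbers for illustration; NOT the answer to the challenge.
demo = [
    PlayerResult("Player 1", {"bear": 4, "elk": 3}, {"bear": 11, "elk": 9}),
    PlayerResult("Player 2", {"bear": 2, "hawk": 3}, {"bear": 4, "hawk": 8}),
    PlayerResult("Player 3", {"fox": 4, "salmon": 5}, {"fox": 12, "salmon": 8}),
]
for r in demo:
    print(r.name, r.animal_counts, r.animal_scores, r.total)
print("Winner(s):", winners(demo))
```

Deriving the total and winner from the per-animal breakdown makes it easy for a reviewer to trace any final number back to a specific count.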

Approach
You may choose any approach, including but not limited to:
- Using existing VLMs (e.g. Gemini, ChatGPT, Claude)
- Combining vision models with prompting and post-processing
- Building or fine-tuning your own model
- Applying rule-based logic on top of VLM outputs
There are no restrictions on tools, libraries, or frameworks.
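As an illustration of applying rule-based logic on top of VLM outputs, the sketch below validates a model's reply before any scoring is done. It assumes, hypothetically, that the VLM was prompted to return a JSON object mapping each player to per-animal counts; that schema, the function name `parse_counts`, and the sample reply are not prescribed by this challenge.

```python
import json
import re

ANIMALS = {"bear", "elk", "salmon", "hawk", "fox"}


def parse_counts(reply: str) -> dict[str, dict[str, int]]:
    """Extract and validate per-player animal counts from a VLM reply.

    Assumed (hypothetical) schema: {"<player>": {"<animal>": <int>, ...}, ...}
    """
    # Models often wrap JSON in prose or code fences, so take the span
    # from the first "{" to the last "}".
    match = re.search(r"\{.*\}", reply, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model reply")
    data = json.loads(match.group(0))

    counts: dict[str, dict[str, int]] = {}
    for player, per_animal in data.items():
        unknown = set(per_animal) - ANIMALS
        if unknown:
            raise ValueError(f"{player}: unknown animals {sorted(unknown)}")
        clean = {}
        for animal in sorted(ANIMALS):
            value = per_animal.get(animal, 0)  # missing animal means zero
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, int) or value < 0:
                raise ValueError(f"{player}: bad count for {animal}: {value!r}")
            clean[animal] = value
        counts[player] = clean
    return counts


# Made-up reply for illustration only.
sample_reply = 'Here are the counts:\n{"Player 1": {"bear": 3, "fox": 2}}\nDone.'
print(parse_counts(sample_reply))
```

Rejecting malformed replies early, instead of silently coercing them, keeps counting errors from being confused with scoring errors later in the pipeline.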
Evaluation Criteria
Submissions will be evaluated based on:
- Correct interpretation of visual inputs
- Accuracy of animal identification and counting
- Correct application of scoring rules
- Clarity of reasoning and assumptions
- Reproducibility of results
Partial solutions are acceptable if clearly explained.
Deliverables
Submit your work in any reproducible format, such as:
- Google Colab notebook
- GitHub repository with a README
- ZIP file containing code and instructions
- Shared VLM prompt conversations (Gemini / Claude / ChatGPT), with explanations
Submission Instructions
Send your completed assignment to:
Email subject: VLM Challenge Submission for Machine Learning Scientist
Include:
- Link(s) to your work
- A short explanation of your approach
- Any assumptions made
Time Expectation
- Estimated effort: 4–6 hours
- No fixed deadline unless communicated separately
Notes
- This challenge is the first step in the Machine Learning Scientist interview process
- Focus on correctness and reasoning rather than UI or presentation
- Clearly state any assumptions you make
FAQs
Everything you need to know about the VLM challenge, submission process, and evaluation.
Yes. Completing this challenge is the first step in the Machine Learning Scientist hiring process at Phronetic AI.
There is no fixed deadline unless communicated separately. You may submit once your solution is complete.
Most candidates typically spend 4–6 hours. We value clarity of reasoning and correctness over speed.
Yes. You may use any existing VLMs or APIs (e.g., Gemini, ChatGPT, Claude), build your own model, or use a hybrid approach.
Yes. Public resources may be used only to understand the rules of Cascadia.
The only inputs to your solution should be the images provided on this page.
Yes. Partial solutions are acceptable if you clearly explain what works, what does not, and any assumptions made.
You may submit a Google Colab notebook, GitHub repository, ZIP file with instructions, or shared VLM prompt conversations. All submissions must be reproducible.
We evaluate submissions based on correctness, reasoning quality, clarity of assumptions, and reproducibility, not on presentation or UI polish.
Please email careers@phronetic.ai for any questions related to this challenge.