Vision–Language Model Challenge

Cascadia Board Game Scoring
This challenge is part of the Machine Learning Scientist hiring process at Phronetic AI.

Challenge Overview

You are given images representing the final state of a completed Cascadia board game played by three players.

Your task is to determine whether a Vision–Language Model (VLM) system can correctly understand the game state and compute final scores based on the official scoring rules.

Task

Using the provided images as input, answer the following:

  1. How many tokens of each animal type are present on each player's board?
  2. What score does each animal type earn for each player?
  3. What is the final total score for each player?
  4. Who won the game?

Bonus (optional):

  • Identify and count habitat tiles
  • Reason about habitat-based scoring if applicable
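Questions 3 and 4 reduce to simple arithmetic once the per-animal scores are known. The sketch below is one possible way to do that step, not a prescribed method. The function names, player labels, and all numbers are illustrative assumptions; none of them are read from the challenge images. Tie-breaking is deliberately left out, so every player sharing the top total is returned.

```python
from typing import Dict, List, Optional

# player -> category -> points
Scores = Dict[str, Dict[str, int]]


def total_scores(animal: Scores, habitat: Optional[Scores] = None) -> Dict[str, int]:
    """Sum each player's animal points, plus habitat points when supplied."""
    totals: Dict[str, int] = {}
    for player, by_animal in animal.items():
        total = sum(by_animal.values())
        if habitat is not None:
            # Players missing from the habitat mapping contribute 0 habitat points.
            total += sum(habitat.get(player, {}).values())
        totals[player] = total
    return totals


def winners(totals: Dict[str, int]) -> List[str]:
    """Return every player sharing the highest total (no tie-break applied)."""
    top = max(totals.values())
    return sorted(p for p, t in totals.items() if t == top)


# Placeholder numbers for illustration only; NOT taken from the challenge images.
example: Scores = {
    "player_1": {"bear": 4, "elk": 9, "hawk": 5, "salmon": 7, "fox": 6},
    "player_2": {"bear": 11, "elk": 5, "hawk": 8, "salmon": 0, "fox": 4},
    "player_3": {"bear": 0, "elk": 13, "hawk": 2, "salmon": 11, "fox": 5},
}
example_totals = total_scores(example)
print(example_totals, winners(example_totals))
```

With these placeholder values two players share the top total, which is why the sketch returns a list rather than a single name.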

Inputs

Assume your inputs are:

  • Images showing the final board positions of all three players
  • Images of the wildlife scoring cards used in the game

These images are the complete and only source of truth.

Final board positions of all three players. This image is the primary input to the challenge.

Scoring Rules Reference

The following wildlife scoring cards were used to calculate victory points in this game.

Wildlife scoring cards showing point calculation rules for bear, elk, hawk, salmon, and fox

You may refer to publicly available resources to understand the rules of Cascadia:

Game Reference

Note: External resources are provided only to understand the game rules.
The final board images and scoring card images above are the only inputs to your solution.

Expected Output

Your system should produce:

  • Animal counts per player
  • Score breakdown by animal type
  • Final total score for each player
  • Winner determination

Outputs should be clearly explained and verifiable.
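One way to keep the four outputs verifiable is to emit them as a single structured report in which each total is derived from its own breakdown. The following is a minimal sketch under that assumption. The `PlayerResult` class, the `build_report` helper, the field names, and the sample values are all hypothetical and are not part of the challenge specification.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, List


@dataclass
class PlayerResult:
    player: str
    animal_counts: Dict[str, int]   # tokens seen on the board, per animal
    animal_scores: Dict[str, int]   # points awarded, per animal
    notes: str = ""                 # assumptions or uncertain readings

    @property
    def total(self) -> int:
        # Derived from the breakdown so the two can never disagree.
        return sum(self.animal_scores.values())


def build_report(results: List[PlayerResult]) -> dict:
    """Combine per-player results into one JSON-serialisable report."""
    totals = {r.player: r.total for r in results}
    top = max(totals.values())
    return {
        "players": [dict(asdict(r), total=r.total) for r in results],
        "winners": sorted(p for p, t in totals.items() if t == top),
    }


# Placeholder values for illustration only; NOT taken from the challenge images.
sample = [
    PlayerResult("player_1", {"bear": 2, "fox": 3}, {"bear": 4, "fox": 9}),
    PlayerResult("player_2", {"bear": 4, "fox": 1}, {"bear": 11, "fox": 3},
                 notes="one fox token partly occluded"),
]
report = build_report(sample)
print(json.dumps(report, indent=2))
```

Keeping counts, scores, and notes side by side lets a reviewer check each number against the images without rerunning the pipeline.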

Example Cascadia scoring sheet showing animal and habitat score breakdown

Approach

You may choose any approach, including but not limited to:

  • Using existing VLMs (e.g. Gemini, ChatGPT, Claude)
  • Combining vision models with prompting and post-processing
  • Building or fine-tuning your own model
  • Applying rule-based logic on top of VLM outputs

There are no restrictions on tools, libraries, or frameworks.
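If you apply rule-based logic on top of VLM outputs, it helps to validate the model's answer before scoring it. The sketch below assumes the VLM was prompted to reply with a JSON object mapping each player to per-animal counts; that reply format, the `parse_counts` name, and the player labels are assumptions made for illustration. No specific VLM API is called here.

```python
import json
from typing import Dict

# The five animal types named on the wildlife scoring cards.
ANIMALS = ("bear", "elk", "hawk", "salmon", "fox")


def parse_counts(raw: str) -> Dict[str, Dict[str, int]]:
    """Parse and validate a VLM reply of the assumed form
    {"player_1": {"bear": 2, ...}, ...}. Raises ValueError on bad input."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict) or not data:
        raise ValueError("expected a non-empty JSON object keyed by player")
    cleaned: Dict[str, Dict[str, int]] = {}
    for player, counts in data.items():
        if not isinstance(counts, dict):
            raise ValueError(f"{player}: expected an object of animal counts")
        unknown = set(counts) - set(ANIMALS)
        if unknown:
            raise ValueError(f"{player}: unknown animals {sorted(unknown)}")
        row = {}
        for animal in ANIMALS:
            value = counts.get(animal, 0)  # a missing animal is treated as 0
            # bool is a subclass of int, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, int) or value < 0:
                raise ValueError(f"{player}/{animal}: invalid count {value!r}")
            row[animal] = value
        cleaned[player] = row
    return cleaned


# Hypothetical model reply, for illustration only.
reply = '{"player_1": {"bear": 2, "fox": 1}}'
print(parse_counts(reply))
```

Rejecting malformed replies early keeps counting errors from silently propagating into the score calculation.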

Evaluation Criteria

Submissions will be evaluated based on:

  • Correct interpretation of visual inputs
  • Accuracy of animal identification and counting
  • Correct application of scoring rules
  • Clarity of reasoning and assumptions
  • Reproducibility of results

Partial solutions are acceptable if clearly explained.
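For the reproducibility criterion, one option is to run the same prompt several times and report where the runs disagree. This is a sketch of that idea, not a required procedure. The `consensus` function and the sample runs are hypothetical, and majority voting is only one of several reasonable ways to reconcile runs.

```python
from collections import Counter
from typing import Dict, List, Tuple

# player -> animal -> count
Counts = Dict[str, Dict[str, int]]


def consensus(runs: List[Counts]) -> Tuple[Counts, List[Tuple[str, str]]]:
    """Return the most common value per (player, animal) cell and the list of
    cells on which the runs did not all agree."""
    merged: Counts = {}
    disputed: List[Tuple[str, str]] = []
    cells = {(p, a) for run in runs for p, row in run.items() for a in row}
    for player, animal in sorted(cells):
        votes = Counter(
            run[player][animal] for run in runs if animal in run.get(player, {})
        )
        # On a tie, most_common returns the value seen first across the runs.
        value, _ = votes.most_common(1)[0]
        merged.setdefault(player, {})[animal] = value
        if len(votes) > 1:
            disputed.append((player, animal))
    return merged, disputed


# Three hypothetical runs, for illustration only.
runs = [
    {"player_1": {"bear": 2, "fox": 1}},
    {"player_1": {"bear": 2, "fox": 2}},
    {"player_1": {"bear": 3, "fox": 2}},
]
merged, disputed = consensus(runs)
print(merged, disputed)
```

Listing the disputed cells alongside the final answer makes it clear which counts deserve a manual check against the images.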

Deliverables

Submit your work in any reproducible format, such as:

  • Google Colab notebook
  • GitHub repository with a README
  • ZIP file containing code and instructions
  • Shared VLM prompt conversations (Gemini / Claude / ChatGPT), with explanations

Submission Instructions

Send your completed assignment to:

📧 careers@phronetic.ai

Email subject:
VLM Challenge Submission for Machine Learning Scientist

Include:

  • Link(s) to your work
  • A short explanation of your approach
  • Any assumptions made

Time Expectation

  • Estimated effort: 4–6 hours
  • No fixed deadline unless communicated separately

Notes

  • This challenge is the first step in the Machine Learning Scientist interview process
  • Focus on correctness and reasoning rather than UI or presentation
  • Clearly state any assumptions you make

FAQs

Everything you need to know about the VLM challenge, submission process, and evaluation.

Is this challenge mandatory to apply for the Machine Learning Scientist role?

Yes. Completing this challenge is the first step in the Machine Learning Scientist hiring process at Phronetic AI.

Is there a deadline to submit the challenge?

There is no fixed deadline unless communicated separately. You may submit once your solution is complete.

How much time is this challenge expected to take?

Most candidates spend 4–6 hours. We value clarity of reasoning and correctness over speed.

Can I use existing Vision–Language Models or APIs?

Yes. You may use any existing VLMs or APIs (e.g., Gemini, ChatGPT, Claude), build your own model, or use a hybrid approach.

Can I use external resources to understand the game rules?

Yes. Public resources may be used only to understand the rules of Cascadia.
The only inputs to your solution should be the images provided on this page.

Is it acceptable to submit a partial solution?

Yes. Partial solutions are acceptable if you clearly explain what works, what does not, and any assumptions made.

What format should I submit my solution in?

You may submit a Google Colab notebook, GitHub repository, ZIP file with instructions, or shared VLM prompt conversations. All submissions must be reproducible.

How will submissions be evaluated?

We evaluate submissions based on correctness, reasoning quality, clarity of assumptions, and reproducibility, not on presentation or UI polish.

Who can I contact if I have questions?

Please email careers@phronetic.ai for any questions related to this challenge.