AI Edge Gallery Agent Skills
Table of contents
- Introduction
- How Skills Work
- Text-Only Skills: The Simplest Case
- JavaScript (JS) Skills
- Native Skills
- How to Add Skills in the Gallery App
- Share Skills with Community
- Tips and Tricks
- Skill Examples
Introduction
An Agent Skill is a modular package of capabilities that extends what a Large Language Model (LLM) can do within the AI Edge Gallery app. By giving the LLM new capabilities and domain-specific knowledge, skills reduce the need for repetitive prompt instructions and lower the barrier for LLMs to discover and integrate new tools dynamically.
How Skills Work
At a high level, each skill is defined by a SKILL.md file that contains
essential metadata and step-by-step instructions. When a user enters a prompt,
the LLM reviews the name and description of each available skill, which are
appended to its system prompt. If the user's request aligns with a skill, the
LLM invokes it automatically.
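To make the selection step concrete, here is a minimal sketch of how skill names and descriptions might be rendered into a system prompt. The function name buildSkillCatalog and the output layout are hypothetical; the app's actual prompt format is not documented here.

```javascript
// Hypothetical helper: the real prompt format used by the app may differ.
function buildSkillCatalog(skills) {
  // Each entry contributes only its name and description, which is all the
  // LLM sees before it decides whether a skill is relevant.
  const lines = skills.map((s) => `- ${s.name}: ${s.description}`);
  return ["Available skills:", ...lines].join("\n");
}

const catalog = buildSkillCatalog([
  { name: "fitness-coach", description: "A cheerful fitness coach." },
  { name: "send-email", description: "Send an email." },
]);
console.log(catalog);
```

Because only these two fields are visible to the model up front, a specific, action-oriented description should make a skill easier to match against a user's request.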
Unlike cloud-based LLMs that can spin up containers or access a terminal to run Python scripts or CLI tools, on-device LLMs operate within a sandboxed mobile environment. They cannot easily execute arbitrary system commands or local scripts due to security and resource constraints.
To overcome this, AI Edge Gallery adapts by focusing on two primary execution paths:
- JavaScript Skills: Running logic inside a lightweight, hidden webview, which provides a cross-platform execution environment for custom logic.
- Native App Intents: Leveraging the Android/iOS operating system's built-in capabilities (like sending email / text messages).
Text-Only Skills: The Simplest Case
The simplest type of skill is a text-only skill, which provides the LLM with a specific persona or scenario data without requiring external code.
Folder Structure and Naming Convention
To create a skill, you must follow a standardized directory structure:
- Directory Name: Create a dedicated folder for your skill using kebab-case (e.g., fitness-coach).
- SKILL.md: This is the only required file for a text-only skill and must reside in the root of your skill folder.
fitness-coach/
└── SKILL.md
The SKILL.md File
The core of the skill is the SKILL.md file. It must contain a frontmatter
metadata section enclosed by --- lines, followed by the instructions for
the LLM.
Example SKILL.md for a Text-Only Skill:
---
name: fitness-coach
description: A cheerful, high-energy fitness coach that provides motivational workout routines.
---
# Cheerful Fitness Coach
## Persona
You are an incredibly enthusiastic and supportive fitness coach! Your goal is
to make exercise feel like a party. Always use upbeat language, plenty of
encouraging emojis, and focus on the "fun" of moving your body.
## Instructions
When the user asks for a workout:
1. Start with a high-energy greeting (e.g., "Ready to crush it?").
2. Provide a 15-minute high-intensity routine that is easy to follow.
3. End with a massive "virtual high-five" and a reminder of how awesome they are
for showing up today! 🌟✨
The LLM uses the Name and Description in the metadata to determine if the skill is relevant to a user's query. If triggered, the Instructions are loaded into the model's context to guide its behavior.
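The split between frontmatter and instructions can be sketched as follows. This is an illustration under stated assumptions, not the app's parser: the function name parseSkillMd is hypothetical, and the frontmatter is read as flat "key: value" lines rather than full YAML.

```javascript
// Splits a SKILL.md string into its frontmatter fields and instruction body.
// Assumption: the frontmatter is treated as flat "key: value" lines. This is
// not a YAML parser, so a nested block such as "metadata:" is not represented
// as a nested object.
function parseSkillMd(source) {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?([\s\S]*)$/.exec(source);
  if (!match) {
    throw new Error("SKILL.md must start with a frontmatter block enclosed by --- lines");
  }
  const meta = {};
  for (const line of match[1].split(/\r?\n/)) {
    const idx = line.indexOf(":");
    if (idx === -1) continue; // skip lines that are not key: value pairs
    meta[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  }
  return { meta, instructions: match[2].trim() };
}

const skill = parseSkillMd(
  "---\nname: fitness-coach\ndescription: A cheerful coach.\n---\n# Cheerful Fitness Coach\n"
);
console.log(skill.meta.name); // fitness-coach
console.log(skill.instructions); // # Cheerful Fitness Coach
```

Only meta.name and meta.description take part in skill selection; the instructions string is what would be loaded into the model's context once the skill is triggered.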
JavaScript (JS) Skills
Because Python is often unsuitable for on-device LLMs within mobile applications, the AI Edge Gallery uses JavaScript-based scripts housed in HTML files to execute custom logic.
How JS Skills Work
JS skills execute logic by loading an HTML file into a hidden webview. The app
calls your skill's logic through a globally exposed asynchronous function named
ai_edge_gallery_get_result that must be attached to the window object.
Step-by-Step: Creating a Full JS Skill
The directory structure for a JS skill is the same as for text-only skills, but
with an extra scripts directory to put your index.html and related
JavaScript files.
Step 1: Create the directory structure
Your folder name must be in kebab-case and match your skill name.
my-js-skill/
├── SKILL.md
└── scripts/
    └── index.html
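A quick pre-flight check for the naming rule might look like the sketch below. The helper checkSkillFolder and the exact kebab-case pattern are assumptions for illustration; the app's own validation rules are not documented here.

```javascript
// Hypothetical pre-flight check; not part of the app.
// Assumption: kebab-case means groups of lowercase letters and digits
// joined by single hyphens.
const KEBAB_CASE = /^[a-z0-9]+(?:-[a-z0-9]+)*$/;

function checkSkillFolder(folderName, skillName) {
  const problems = [];
  if (!KEBAB_CASE.test(folderName)) {
    problems.push(`Folder name "${folderName}" is not kebab-case`);
  }
  if (folderName !== skillName) {
    problems.push(`Folder name "${folderName}" does not match skill name "${skillName}"`);
  }
  return problems;
}

console.log(checkSkillFolder("my-js-skill", "my-js-skill")); // []
console.log(checkSkillFolder("MyJsSkill", "my-js-skill").length); // 2
```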
Step 2: Write the SKILL.md file
You must explicitly instruct the LLM to call the run_js tool and define the
exact JSON schema it should pass as data.
---
name: my-js-skill
description: Calculate the hash of a given text.
---
# Calculate hash
## Instructions
Call the `run_js` tool with the following exact parameters:
- script name: index.html
- data: A JSON string with the following field:
  - text: String. The text to calculate hash for.
Tip
If your main entry point is named index.html, the script name line in the
instructions above is optional. The LLM will look for index.html within the
scripts/ directory by default if no other file is specified.
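For reference, the data value described in these instructions is a JSON string such as {"text":"hello world"}. Since the model's output may not always match the schema, the script side can validate it before use. The helper parseHashRequest below is a hypothetical sketch, not part of the app.

```javascript
// Hypothetical validator for the `data` argument described in the SKILL.md above.
// It accepts the JSON string sent by the LLM and returns the `text` field.
function parseHashRequest(data) {
  let parsed;
  try {
    parsed = JSON.parse(data);
  } catch (e) {
    throw new Error("data is not valid JSON");
  }
  if (parsed === null || typeof parsed !== "object" || typeof parsed.text !== "string") {
    throw new Error('data must be a JSON object with a string "text" field');
  }
  return parsed.text;
}

// The payload the LLM is instructed to produce for this skill:
const exampleData = JSON.stringify({ text: "hello world" });
console.log(parseHashRequest(exampleData)); // hello world
```

Throwing a descriptive error here lets the entry function report it through the error field of its response.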
Step 3: Create the index.html entry point
Embed your JavaScript logic inside scripts/index.html. You must define an
asynchronous function ai_edge_gallery_get_result and expose it on window.
This function receives a single argument, data, which is a JSON string passed
from the app containing the parameters from the LLM, as described in the
SKILL.md instructions. Inside this function, you must parse this data, execute
your logic, and return a JSON string. The returned JSON must contain either a
result field on success or an error field on failure.
<!DOCTYPE html>
<html lang="en">
  <head></head>
  <body>
    <script>
      window['ai_edge_gallery_get_result'] = async (data) => {
        try {
          const jsonData = JSON.parse(data);
          const processedData = await yourImplementation(jsonData.text);
          return JSON.stringify({
            result: processedData
          });
        } catch (e) {
          console.error(e);
          return JSON.stringify({
            error: `Failed: ${e.message}`
          });
        }
      };
      async function yourImplementation(text) {
        return text + " processed!";
      }
    </script>
  </body>
</html>
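The placeholder yourImplementation above only appends a suffix. As one possible stand-in for the "Calculate hash" skill, the sketch below computes a 32-bit FNV-1a hash. The choice of FNV-1a is an assumption made for brevity; the source does not say which hash the skill uses, and FNV-1a is not cryptographically secure.

```javascript
// Illustrative stand-in for yourImplementation: 32-bit FNV-1a over UTF-8 bytes.
// Assumption: the hash algorithm is not specified by the source.
function fnv1a32(text) {
  let hash = 0x811c9dc5; // FNV offset basis
  for (const byte of new TextEncoder().encode(text)) {
    hash ^= byte;
    hash = Math.imul(hash, 0x01000193); // multiply by the FNV prime, keeping 32 bits
  }
  // Convert to an unsigned value and format as 8 hex digits.
  return (hash >>> 0).toString(16).padStart(8, "0");
}

async function yourImplementation(text) {
  return fnv1a32(text);
}

console.log(fnv1a32("")); // 811c9dc5
console.log(fnv1a32("a")); // e40c292c
```

If a cryptographic digest is needed, the Web Crypto API's crypto.subtle.digest("SHA-256", bytes) is the usual choice where the webview exposes it.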
Tip
Think of index.html as a "headless" execution environment that leverages
the full power of the web ecosystem within a standard mobile webview. This
setup allows you to move beyond basic scripts by making fetch() calls to
third-party APIs, integrating external libraries via CDN or relative paths in
the <script> tag, and utilizing advanced Web APIs like WebAssembly. For more
complex projects, you can maintain a clean architecture by splitting your
logic into separate .js files within the scripts/ directory and importing
them directly into your main index.html entry point.
Returning an Image
To return an image to the chat, assign a base64 encoded string to the
image.base64 field in your returned JSON.
Example:
window['ai_edge_gallery_get_result'] = async (data) => {
  try {
    return JSON.stringify({
      result: "Image generated.",
      image: {
        base64: "imageBase64String"
      }
    });
  } catch (e) {
    return JSON.stringify({
      error: e.message
    });
  }
};
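In a webview, images often start out as data URLs, for example from canvas.toDataURL("image/png"). The sketch below extracts the base64 payload from such a URL. It assumes that image.base64 expects the bare base64 string rather than the full data URL, which the source does not state; the helper names are hypothetical.

```javascript
// Assumption: image.base64 expects the bare base64 payload, not a full
// "data:image/png;base64,..." URL. The source does not specify this.
function dataUrlToBase64(dataUrl) {
  const marker = ";base64,";
  const idx = dataUrl.indexOf(marker);
  if (!dataUrl.startsWith("data:") || idx === -1) {
    throw new Error("Expected a base64 data URL");
  }
  return dataUrl.slice(idx + marker.length);
}

// Builds the JSON string returned to the app for an image result.
function buildImageResponse(message, dataUrl) {
  return JSON.stringify({
    result: message,
    image: { base64: dataUrlToBase64(dataUrl) },
  });
}

console.log(buildImageResponse("Image generated.", "data:image/png;base64,iVBORw0KGgo="));
// {"result":"Image generated.","image":{"base64":"iVBORw0KGgo="}}
```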
Returning a Webview
You can return an inline webview that the app will render in the chat. Specify
a url (either absolute or relative to the skill's assets folder) and,
optionally, an aspectRatio (which defaults to 1.333 if omitted).
Example:
window['ai_edge_gallery_get_result'] = async (data) => {
  try {
    return JSON.stringify({
      result: "Here is the interactive view.",
      webview: {
        url: "webview.html",
        aspectRatio: 1.0
      }
    });
  } catch (e) {
    return JSON.stringify({
      error: e.message
    });
  }
};
Here is how files should be organized:
my-interactive-skill/
├── SKILL.md
├── scripts/
│   └── index.html    <-- The hidden logic runner
└── assets/
    └── webview.html  <-- The HTML rendered in the chat UI
Tip
You can pass dynamic data from your background logic (index.html) to your
interactive UI (webview.html) by appending URL query parameters to the
webview URL. In your script, construct the URL string to include key-value
pairs, such as webview.html?data=value. Your interactive page can then
retrieve this information using the URLSearchParams API to customize the
user interface based on the LLM's output.
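The query-parameter handoff described in this tip can be sketched with the standard URLSearchParams API. The parameter names (location, zoom) and both helper functions are hypothetical.

```javascript
// In scripts/index.html: build the webview URL with query parameters.
function buildWebviewUrl(page, params) {
  const query = new URLSearchParams(params).toString();
  return query ? `${page}?${query}` : page;
}

// In assets/webview.html: read the parameters back.
// In a real page, pass window.location.search as the argument.
function readWebviewParams(search) {
  return Object.fromEntries(new URLSearchParams(search));
}

const url = buildWebviewUrl("webview.html", { location: "New York", zoom: "12" });
console.log(url); // webview.html?location=New+York&zoom=12
console.log(readWebviewParams("?location=New+York&zoom=12").location); // New York
```

URLSearchParams handles the encoding and decoding of special characters, so values containing spaces or symbols survive the round trip.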
Passing Secrets
If your JS script requires an API key or token, do not pass it through the LLM prompt. Instead, the AI Edge Gallery app provides a secure mechanism: it will display a native dialog to the user to input the required secret when the JS skill is called, which is then passed directly to your script.
- Add require-secret: true to your SKILL.md metadata.
- (Optional) Add require-secret-description: some description to your SKILL.md metadata. This will be shown in the prompt dialog.
- Add a second parameter to your JS entry function to receive the secret.
Example SKILL.md snippet:
---
name: some-api-skill
description: Fetches secure data.
metadata:
  require-secret: true
  require-secret-description: Go to GitHub settings page to copy your token.
---
Example index.html snippet:
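window['ai_edge_gallery_get_result'] = async (data, secret) => {
  try {
    // Parameters from the LLM (unused in this minimal example).
    const jsonData = JSON.parse(data);
    // Use the secret variable to authenticate your API call.
    const response = await fetch("https://api.example.com/data", {
      headers: {
        "Authorization": `Bearer ${secret}`
      }
    });
    // fetch() does not reject on HTTP error statuses, so check explicitly.
    if (!response.ok) {
      throw new Error(`Request failed with status ${response.status}`);
    }
    const resultText = await response.text();
    return JSON.stringify({
      result: resultText
    });
  } catch (e) {
    return JSON.stringify({
      error: e.message
    });
  }
};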
Native Skills
Native skills map instructions to predefined tools in the Gallery app, such as
the run_intent tool. This allows the LLM to interact with the Android device
natively to perform actions like sending emails or text messages.
To use the run_intent tool, you must instruct the LLM to call it with two
exact parameters:
- intent: The native action to run.
- parameters: A JSON string containing the required parameter values for the intent.
Example SKILL.md for a Native Intent (Email):
---
name: send-email
description: Send an email.
---
# Send email
## Instructions
Call the `run_intent` tool with the following exact parameters:
- intent: send_email
- parameters: A JSON string with the following fields:
  - extra_email: the email address to send the email to. String.
  - extra_subject: the subject of the email. String.
  - extra_text: the body of the email. String.
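To illustrate the shape of the arguments these instructions ask the LLM to produce, here is a small sketch. The field names come from the SKILL.md above; the helper buildSendEmailCall is hypothetical and not part of the app.

```javascript
// Hypothetical helper showing the shape of a run_intent call for send_email.
function buildSendEmailCall(to, subject, body) {
  return {
    intent: "send_email",
    // `parameters` is a JSON string, not a nested object.
    parameters: JSON.stringify({
      extra_email: to,
      extra_subject: subject,
      extra_text: body,
    }),
  };
}

const call = buildSendEmailCall("alex@example.com", "Lunch", "Are you free at noon?");
console.log(call.parameters);
// {"extra_email":"alex@example.com","extra_subject":"Lunch","extra_text":"Are you free at noon?"}
```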
Important
While the app currently supports sending emails and text messages out of the box, supporting additional native intent-based skills requires updating the app's source code. To add new capabilities, such as opening the camera or setting alarms, you must define the logic within the app's codebase. Developers can refer to IntentHandler.kt to see how existing intents are mapped and to learn how to register new custom intents for the LLM to invoke.
How to Add Skills in the Gallery App
There are three ways to add a skill to the app:
Add from Community-Featured Skills
We have curated a list of skills contributed by our community. To try out a skill from this list, follow the steps below:
Steps:
- Enter the Agent Skills use case with your selected model, and navigate to the Skill Manager by tapping the "Skills" chip.
- Tap the (+) button and select the Add skill from featured list option.
- From there, simply tap a skill from the list to automatically add it to the system.
Add from a URL
For easier sharing, you can host your skill on a web server and add it to the app by using the skill URL.
Steps:
- Enter the Agent Skills use case with your selected model, and navigate to the Skill Manager by tapping the "Skills" chip.
- Tap the (+) button and select the Load skill from URL option.
- Enter the skill URL in the popup dialog. The URL should point to the skill folder itself.
  Verify your URL: Ensure the URL is correct by loading the SKILL.md file in your browser (e.g., https://your/url/SKILL.md). If the raw content of the file displays correctly, your URL is ready to use (excluding the SKILL.md suffix).
Important
To avoid webview loading failures, you must host your JS skill assets on
a true web hosting service such as GitHub Pages or Cloudflare. Standard
GitHub repository URLs and raw.githubusercontent.com serve files as
text/plain rather than with the MIME types required for execution. Always
use the deployment URL provided by your web host.
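The URL rules above can be checked before entering a URL into the app. The sketch below derives the SKILL.md address from a skill folder URL and flags the two GitHub hosts mentioned above. The function inspectSkillUrl and the example URLs are hypothetical.

```javascript
// Hypothetical pre-flight check for a skill folder URL; not part of the app.
function inspectSkillUrl(folderUrl) {
  const url = new URL(folderUrl);
  // Make sure the folder URL ends with "/" so SKILL.md resolves inside it.
  if (!url.pathname.endsWith("/")) url.pathname += "/";
  const warnings = [];
  if (url.hostname === "github.com" || url.hostname === "raw.githubusercontent.com") {
    warnings.push(`${url.hostname} is not a deployment URL; use a web host such as GitHub Pages`);
  }
  return { skillMdUrl: new URL("SKILL.md", url).href, warnings };
}

const report = inspectSkillUrl("https://your-username.github.io/skills/my-js-skill");
console.log(report.skillMdUrl);
// https://your-username.github.io/skills/my-js-skill/SKILL.md
console.log(report.warnings.length); // 0
```

Opening the printed skillMdUrl in a browser is then the manual verification step described earlier.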
Tip
If you use GitHub Pages to serve your skills, note that it processes files with Jekyll by default, which can automatically convert .md files into .html. Because the AI Edge Gallery app requires access to the raw SKILL.md file to parse instructions, you must disable this behavior:
- Create an empty file named .nojekyll in the root of your repository.
- Commit and push this file to your main branch.
This ensures GitHub Pages serves your Markdown files as-is rather than attempting to render them as static webpages.
Import from a Local File
You can load skills directly from your Android device's file system.
Steps:
- Connect your Android device to your computer and push your entire skill folder (e.g., my-js-skill/) onto the device (e.g., to the Download folder):
  adb push my-js-skill/ /sdcard/Download/
- Enter the Agent Skills use case with your selected model, and navigate to the Skill Manager by tapping the "Skills" chip.
- Tap the (+) button and select the Import local skill option.
- Use the Android file picker to select the directory containing your SKILL.md file. The app will copy the directory into its internal storage and make the skill available.
Share Skills with Community
We've created a dedicated GitHub Discussions category for users to showcase their skills. Follow these steps to share your custom skills with the global AI Edge Gallery community:
- Click the "New discussion" button.
- Follow the instructions and fill in the form to share your skill.
Tips and Tricks
Link to Your Skill Homepage
You can make your skill name clickable within the Skill Manager UI by adding a
homepage field to the metadata in your SKILL.md file. This is a great way
to link users to your GitHub repository, documentation, or personal website.
Example:
---
name: fitness-coach
description: A cheerful, high-energy fitness coach.
metadata:
  homepage: https://github.com/your-username/fitness-coach-skill
---
Debug JS Skill In App
When running a JavaScript skill, you can expand the execution panel to inspect the call details and the specific data passed to your script. This panel also provides access to real-time console logs.
Skill Examples
- Kitchen adventure: Act as a dungeon master for a text-based adventure set in a world where everyone is a sentient kitchen appliance.
- Calculate hash: Calculate the hash of a given text.
- Query Wikipedia: Query summary from Wikipedia for a given topic.
- QR code: Generate a QR code for a given URL.
- Interactive map: Show an interactive map view for the given location.
- Mood tracker: A simple mood tracking skill that stores and visualizes your daily mood and comments.
- Virtual piano: Show a virtual piano to play music.
- Text spinner: Spin the given text on my head.
- Mood music: Suggest or play music based on the user's mood, including analyzing images or audio.
- Restaurant roulette: Show a roulette wheel that lets the user randomly select a restaurant based on location and cuisine.
- Send email: Send an email.
Check out more examples from our community-contributed skills.