Vision-driven Android device automation using Midscene. Operates entirely from screenshots, with no DOM or accessibility labels required, so it can interact with any element visible on screen regardless of technology stack. Control Android devices with natural language commands via ADB: perform taps, swipes, text input, app launches, screenshots, and more.
Trigger keywords: android, phone, mobile app, tap, swipe, install app, open app on phone, android device, mobile automation, adb, launch app, mobile screen, test android app, verify mobile app, QA on phone, check the app on android, test on device, see if the app works on phone, end-to-end test on android, visual verification on mobile.
Powered by Midscene.js (https://midscenejs.com)
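For orientation, the device actions listed above (tap, swipe, text input, app launch, screenshot) each have a plain ADB counterpart. The sketch below is not Midscene's API and does not claim to show how Midscene drives the device internally; it only builds standard `adb` command lines for those actions. The helper names (`adb_prefix`, `tap_cmd`, `swipe_cmd`, `text_cmd`, `launch_cmd`, `screenshot_cmd`, `run`) and the example package name are hypothetical, and it assumes `adb` is on the `PATH`.

```python
import subprocess
from typing import List, Optional


def adb_prefix(serial: Optional[str] = None) -> List[str]:
    """Base adb invocation, optionally pinned to one device with -s."""
    return ["adb"] + (["-s", serial] if serial else [])


def tap_cmd(x: int, y: int, serial: Optional[str] = None) -> List[str]:
    """Tap at pixel coordinates (x, y)."""
    return adb_prefix(serial) + ["shell", "input", "tap", str(x), str(y)]


def swipe_cmd(x1: int, y1: int, x2: int, y2: int,
              duration_ms: int = 300,
              serial: Optional[str] = None) -> List[str]:
    """Swipe from (x1, y1) to (x2, y2) over duration_ms milliseconds."""
    return adb_prefix(serial) + [
        "shell", "input", "swipe",
        str(x1), str(y1), str(x2), str(y2), str(duration_ms),
    ]


def text_cmd(text: str, serial: Optional[str] = None) -> List[str]:
    """Type text into the focused field.

    `input text` does not accept literal spaces, so they are encoded as %s.
    Other shell metacharacters are NOT escaped in this sketch.
    """
    return adb_prefix(serial) + ["shell", "input", "text", text.replace(" ", "%s")]


def launch_cmd(package: str, serial: Optional[str] = None) -> List[str]:
    """Start an app's launcher activity via the monkey tool."""
    return adb_prefix(serial) + [
        "shell", "monkey", "-p", package,
        "-c", "android.intent.category.LAUNCHER", "1",
    ]


def screenshot_cmd(serial: Optional[str] = None) -> List[str]:
    """Capture the screen as PNG bytes on stdout."""
    return adb_prefix(serial) + ["exec-out", "screencap", "-p"]


def run(cmd: List[str]) -> bytes:
    """Execute a built command and return its stdout (needs a connected device)."""
    return subprocess.run(cmd, check=True, capture_output=True).stdout
```

With a device attached, `run(launch_cmd("com.example.app"))` followed by `open("screen.png", "wb").write(run(screenshot_cmd()))` would start the (hypothetical) app and save a screenshot. Keeping command construction separate from execution makes the builders easy to check without hardware.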
This skill does not declare a tool allowlist. The agent host applies whatever default tools are available at runtime.
SKILL.md / Manifest: https://raw.githubusercontent.com/web-infra-dev/midscene-skills/main/skills/android-automation/SKILL.md
Registry: github (via claudemarketplaces.com)