
Python SDK

The Python SDK is intended for gateway programs, validation scripts, and existing Python services. The package already handles the MQTT connection, command subscription, parameter validation, command responses, state reports, and event reports. To connect a real device, replace the default state update logic with calls to real sensors, actuators, or existing services.

Use Cases

  • A gateway program connects a group of devices or an external system to a Device Agent.
  • Scripts validate a DeviceSpec, commands, telemetry, and events quickly.
  • An existing Python service already reads device data or calls business systems.

Package Contents

File                  Purpose
src/main.py           Device entry point for connection, subscription, responses, status reports, and events
src/voice_client.py   Device-side voice WebSocket client
device-spec.json      Current DeviceSpec, used for command validation and field mapping
.env.example          MQTT, namespace, productId, deviceId, and connection settings
README.md             Setup, run, and development guide for this package
_references/          Shared SDK source for checking message shapes

For a real device, you mainly work in src/main.py: state generation, command handling, and event triggers.

Access Steps

  1. Download the Python SDK package.
  2. Copy .env.example to .env and update the MQTT broker address or credentials if needed.
  3. Start the program and confirm the device comes online.
  4. Replace the default state update logic in apply_command_to_state().
  5. Return to the Device Agent workspace and test commands, state, and events.

bash
cp .env.example .env
uv run device-agent-toolkit

You can also run the entry file directly:

bash
uv run python src/main.py
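
Before starting the program, it can help to confirm that .env defines every variable listed in .env.example. The following is a minimal sketch of such a check, not part of the SDK; the helper names env_keys and missing_env_keys are hypothetical, and it assumes plain KEY=value lines.

```python
from pathlib import Path


def env_keys(text: str) -> set[str]:
    """Return the variable names defined in dotenv-style text."""
    keys = set()
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name = line.split("=", 1)[0].strip()
        if name.startswith("export "):
            name = name[len("export "):].strip()
        keys.add(name)
    return keys


def missing_env_keys(example_text: str, env_text: str) -> list[str]:
    """List keys present in the example file but absent from the real .env."""
    return sorted(env_keys(example_text) - env_keys(env_text))


if __name__ == "__main__":
    example, env = Path(".env.example"), Path(".env")
    if example.exists() and env.exists():
        print(missing_env_keys(example.read_text(), env.read_text()))
```

An empty list means every key from .env.example is present; it does not check that the values are valid.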

Implement Device Logic

The default code validates each command and its parameters against device-spec.json, then merges the command parameters into the current state. For a real device, update apply_command_to_state():

python
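import time
from copy import deepcopy


def apply_command_to_state(device_spec, state, command, params):
    # Work on a copy so the caller's state is not mutated.
    next_state = deepcopy(state)

    if command == "set_temperature":
        target = params["target_temperature"]
        # call_thermostat_service() is a placeholder for your own device or service call.
        call_thermostat_service(target)
        next_state["target_temperature"] = target

    # Refresh the UTC timestamp only if the state tracks one.
    if "updated_at" in next_state:
        next_state["updated_at"] = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime())

    return next_state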

Use this function to perform the real device action, such as reading sensors, calling a gateway interface, controlling a relay, or translating the command into an existing system API. The returned state is used for command responses and state reports.
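
The default "validate, then merge" behavior described earlier can be pictured with a short sketch. This is an illustration only, not the SDK's implementation: EXAMPLE_SPEC uses a hypothetical, simplified shape that may not match the real device-spec.json, and validate_and_merge is a hypothetical name.

```python
from copy import deepcopy

# Hypothetical spec shape: command name -> parameter name -> accepted types.
# The real device-spec.json may be structured differently.
EXAMPLE_SPEC = {
    "commands": {
        "set_temperature": {"target_temperature": (int, float)},
    }
}


def validate_and_merge(spec, state, command, params):
    """Reject unknown commands or bad parameters, then merge params into a copy of state."""
    allowed = spec["commands"].get(command)
    if allowed is None:
        raise ValueError(f"unknown command: {command}")
    for name, value in params.items():
        if name not in allowed:
            raise ValueError(f"unknown parameter: {name}")
        if not isinstance(value, allowed[name]):
            raise TypeError(f"wrong type for parameter: {name}")
    # Merge on a copy so the previous state stays available if the device call fails.
    next_state = deepcopy(state)
    next_state.update(params)
    return next_state
```

Validating before touching the device keeps malformed commands from reaching the hardware, and returning a new dictionary leaves the previous state intact for comparison or rollback.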

If the DeviceSpec defines events, report them from business logic with publish_event():

python
publish_event("temperature_alarm", {
    "current_temperature": 32.5,
    "level": "warning",
})
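
In practice an event such as temperature_alarm is usually raised when a reading crosses a limit, not on every reading. The sketch below shows one way to do that; the ThresholdAlarm class and the 30.0 threshold are hypothetical, and the publish argument stands in for publish_event(), assuming it takes an event name and a payload as in the call above.

```python
class ThresholdAlarm:
    """Publish one event each time a reading rises above the threshold."""

    def __init__(self, threshold, publish):
        self.threshold = threshold
        self.publish = publish  # callable(event_name, payload), e.g. publish_event
        self.active = False

    def update(self, current):
        """Feed a new reading; return True while the reading is above the threshold."""
        above = current > self.threshold
        # Publish only on the transition from normal to alarm, not on every reading.
        if above and not self.active:
            self.publish("temperature_alarm", {
                "current_temperature": current,
                "level": "warning",
            })
        self.active = above
        return above
```

A device loop could create ThresholdAlarm(30.0, publish_event) once and call update() with each sensor reading. Passing the publish function in also makes the trigger logic easy to test without an MQTT connection.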

Voice Access Code

The Python SDK includes src/voice_client.py for device-side voice conversations. It connects to /ws/voice, sends 16 kHz mono Int16LE PCM audio, and receives ASR text, agent replies, and TTS audio through event callbacks.

python
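import asyncio

from voice_client import VoiceClient


async def main(pcm_chunk):
    voice = VoiceClient(
        ws_url="ws://127.0.0.1:3001/ws/voice",
        device_id="device-001",
        product_id="agent-001",
    )

    await voice.connect()
    await voice.start_listening("manual")
    # pcm_chunk holds 16 kHz mono Int16LE PCM audio from your microphone capture.
    await voice.send_audio(pcm_chunk)
    await voice.stop_listening()


# Run with: asyncio.run(main(pcm_chunk))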

For a real device, connect microphone capture and speaker playback to these calls. Voice service, voice type, and credential settings are covered in Voice Interaction.
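
To try the audio path before a microphone is wired up, a synthetic signal in the expected format can be useful. The sketch below generates a sine tone as 16 kHz mono Int16LE PCM and splits it into fixed-duration chunks. The helper names and the 20 ms chunk duration are assumptions for illustration; the SDK may expect a different chunk size.

```python
import math
import struct

SAMPLE_RATE = 16_000  # 16 kHz mono
BYTES_PER_SAMPLE = 2  # Int16


def tone_pcm(duration_s=0.1, freq_hz=440.0, amplitude=0.3):
    """Generate a sine tone as little-endian 16-bit PCM bytes."""
    count = int(SAMPLE_RATE * duration_s)
    samples = [
        int(amplitude * 32767 * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE))
        for i in range(count)
    ]
    # "<" selects little-endian, "h" is a signed 16-bit integer.
    return struct.pack(f"<{count}h", *samples)


def split_chunks(pcm, chunk_ms=20):
    """Split PCM bytes into chunks of chunk_ms milliseconds each."""
    size = SAMPLE_RATE * chunk_ms // 1000 * BYTES_PER_SAMPLE
    return [pcm[i:i + size] for i in range(0, len(pcm), size)]
```

Each chunk could then be passed to voice.send_audio() in a loop, assuming that method accepts raw bytes.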

Vision Recognition Code

src/main.py includes the command flow for photo recognition. When the DeviceSpec contains one of these commands, the program runs vision recognition before normal state updates:

  • capture_and_recognize
  • take_photo_vision
  • vision_recognize
  • photo_identify

The device first checks imageDataUrl and imageBase64 in the command parameters, then VISION_FALLBACK_IMAGE_DATA_URL in .env. If none is available, it calls capture_local_vision_image(). For a real device, implement that function with a camera, screenshot, or image file source.

python
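def capture_local_vision_image():
    # read_camera_frame_as_base64() is a placeholder: return one frame from your
    # camera, screenshot, or image file as a base64-encoded string.
    return {
        "mimeType": "image/jpeg",
        "imageBase64": read_camera_frame_as_base64(),
        "source": "sdk-camera",
    }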
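
The lookup order for the image (command parameters, then the .env fallback, then local capture) can be sketched as follows. This is an illustration under assumptions, not the code in src/main.py: resolve_vision_image, the "source" labels other than the one shown above, and the image/jpeg default for imageBase64 are all hypothetical.

```python
def _from_data_url(data_url, source):
    """Split 'data:<mime>;base64,<payload>' into the dictionary shape used above."""
    header, _, payload = data_url.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64") or not payload:
        raise ValueError("expected a base64 data URL")
    mime_type = header[len("data:"):-len(";base64")]
    return {"mimeType": mime_type, "imageBase64": payload, "source": source}


def resolve_vision_image(params, env, capture):
    """Pick an image: command parameters first, then the env fallback, then capture()."""
    if params.get("imageDataUrl"):
        return _from_data_url(params["imageDataUrl"], "command-data-url")
    if params.get("imageBase64"):
        return {
            "mimeType": "image/jpeg",  # assumed default; raw base64 carries no type
            "imageBase64": params["imageBase64"],
            "source": "command-base64",
        }
    fallback = env.get("VISION_FALLBACK_IMAGE_DATA_URL")
    if fallback:
        return _from_data_url(fallback, "env-fallback")
    # Nothing was supplied, so fall back to the local camera or file source.
    return capture()
```

Here env can be os.environ or any dictionary, and capture would be capture_local_vision_image.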

The program uploads the image to /api/vision/frames, then calls /api/chat with visionRefs. This is one photo recognition round per command, not continuous video streaming.

Verify Access

After the program starts, return to the Device Agent workspace and confirm:

  1. The device appears in the device list and is online.
  2. Current state shows fields reported by the Python program.
  3. A conversation command executes the logic in apply_command_to_state().
  4. If publish_event() is called, the event appears in recent events.