Made in China

SPV30 series is an intelligent MCU that focuses on low power consumption for offline human-machine interaction, with standby power consumption less than 20uA, voice wake-up standby power consumption less than 700uA, and working power consumption as low as 20mA. It is widely used in products with power requirements, such as TWS, intelligent wearable, single live wire switch and other applications.

Introduction: The Security Gap in Bluetooth Mesh Provisioning

Bluetooth Mesh networks are increasingly deployed in smart buildings, industrial IoT, and lighting systems. The provisioning process—where an unprovisioned device (a "node") is added to the network—is the most critical security juncture. Standard Bluetooth Mesh provisioning uses an Out-of-Band (OOB) authentication mechanism, typically based on a static PIN or numeric comparison. However, this approach is vulnerable to eavesdropping, man-in-the-middle (MITM) attacks, and replay attacks, especially when the OOB channel is weak or absent. Chinese-manufactured System-on-Chips (SoCs), such as those from Telink (TLSR825x, TLSR951x) and Beken (BK7231, BK7252), offer competitive performance and cost but often lack hardware-accelerated cryptographic engines for public-key cryptography. This article presents a custom provisioning solution that integrates Elliptic Curve Diffie-Hellman (ECDH) key exchange with a modified Secure Network Beacon (SNB) to establish a robust, authenticated session before the standard provisioning protocol begins. The implementation runs entirely on the SoC’s CPU, with careful optimization to meet real-time constraints.

Core Technical Principle: ECDH Pre-Provisioning Handshake

The standard Bluetooth Mesh provisioning protocol (Mesh Profile Specification v1.0+) uses a four-phase flow: Beaconing, Invitation, Provisioning, and Configuration. Our enhancement inserts a secure pre-handshake before the Invitation phase. The unprovisioned device broadcasts a custom Secure Network Beacon that includes its ECDH public key, a nonce, and a timestamp. The provisioner responds with its own public key and a signed confirmation. Both parties compute a shared secret using ECDH (curve secp256r1, also known as P-256). This shared secret is then used to derive a session key via HKDF (HMAC-based Key Derivation Function). The session key encrypts the subsequent provisioning payloads, mitigating passive eavesdropping and active MITM attacks.

The packet format for the enhanced Secure Network Beacon is as follows:

| Byte 0-1 | Byte 2-3 | Byte 4-19 | Byte 20-35 | Byte 36-51 | Byte 52-53 |
|---------|---------|----------|----------|----------|----------|
| PDU Type| AD Type | Device UUID (16B) | Public Key X (32B) | Nonce (16B) | CRC16   |
  • PDU Type: 0x2B (Custom Mesh Beacon, non-standard).
  • AD Type: 0x16 (Service Data - 16-bit UUID). The UUID is a custom service ID (e.g., 0xFFE0).
  • Device UUID: Unique 128-bit identifier of the device (as per Mesh Profile).
  • Public Key X: The X-coordinate of the ECDH public key (compressed form, 32 bytes). The Y-coordinate is derived during computation.
  • Nonce: Random 16-byte value generated per beacon transmission to prevent replay.
  • CRC16: CCITT CRC-16 over the entire beacon payload (excluding CRC field).

The provioner’s response packet (sent on a dedicated connection interval) mirrors this structure but includes an additional signature field:

| Byte 0-1 | Byte 2-3 | Byte 4-19 | Byte 20-35 | Byte 36-51 | Byte 52-67 | Byte 68-83 | Byte 84-85 |
|---------|---------|----------|----------|----------|----------|----------|----------|
| PDU Type| AD Type | Device UUID | Public Key X | Nonce (Prov) | Signature (32B) | Nonce (Dev) | CRC16   |
  • Signature: ECDSA signature over the concatenation of (Device UUID || Device Public Key X || Device Nonce || Provisioner Public Key X || Provisioner Nonce). This authenticates the provioner’s identity.

The key derivation uses the following formula:

Shared Secret = ECDH(Provisioner Private Key, Device Public Key) == ECDH(Device Private Key, Provisioner Public Key)
Session Key = HKDF-SHA256(Shared Secret, "mesh-custom-session", 32)
IV = HKDF-SHA256(Shared Secret, "mesh-custom-iv", 8)
  • The Session Key encrypts the provisioning data (Invitation, Provisioning PDUs) using AES-CCM with a 4-byte MIC.
  • The IV is used as the nonce base for the AES-CCM encryption.

Implementation Walkthrough: C Code on Telink TLSR825x

The following code snippet demonstrates the core ECDH key exchange and HKDF derivation on a Telink TLSR825x SoC (32-bit RISC-V core, 512KB Flash, 64KB RAM). The implementation uses the built-in AES-128 hardware engine for the HKDF steps, while ECDH is performed in software using the mbedTLS library (ported to the SoC). The code assumes the device has already generated its ECDH key pair during initialization.

#include <mbedtls/ecdh.h>
#include <mbedtls/hkdf.h>
#include <mbedtls/sha256.h>
#include <stdint.h>

// Pre-generated device ECDH key pair (stored in flash)
extern mbedtls_ecp_keypair dev_keypair;

// Buffer for received provisioner public key
uint8_t prov_pub_x[32];

// Shared secret buffer
uint8_t shared_secret[32];

// Session key and IV
uint8_t session_key[32];
uint8_t session_iv[8];

// Function to perform ECDH and derive session keys
void perform_ecdh_handshake(uint8_t *device_uuid, uint8_t *device_nonce,
                            uint8_t *prov_pub_x, uint8_t *prov_nonce,
                            uint8_t *prov_signature) {
    mbedtls_ecdh_context ecdh;
    mbedtls_mpi shared_secret_mpi;
    uint8_t hash_input[96]; // For signature verification
    uint8_t hash_output[32];

    // 1. Verify provisioner signature (simplified - assume public key known)
    // In practice, the provisioner's public key is pre-shared or obtained via OOB
    mbedtls_sha256_context sha256;
    mbedtls_sha256_init(&sha256);
    mbedtls_sha256_starts(&sha256, 0);
    mbedtls_sha256_update(&sha256, device_uuid, 16);
    mbedtls_sha256_update(&sha256, dev_keypair.pub.X.p, 32);
    mbedtls_sha256_update(&sha256, device_nonce, 16);
    mbedtls_sha256_update(&sha256, prov_pub_x, 32);
    mbedtls_sha256_update(&sha256, prov_nonce, 16);
    mbedtls_sha256_finish(&sha256, hash_output);
    // ... (ECDSA verification omitted for brevity)

    // 2. Compute ECDH shared secret
    mbedtls_ecdh_init(&ecdh);
    mbedtls_ecp_group_load(&ecdh.grp, MBEDTLS_ECP_DP_SECP256R1);
    mbedtls_mpi_read_binary(&ecdh.d, dev_keypair.d.p, 32); // Device private key
    mbedtls_ecp_point_read_binary(&ecdh.grp, &ecdh.Qp, prov_pub_x, 32); // Provisioner public key (compressed)
    mbedtls_ecdh_compute_shared(&ecdh.grp, &shared_secret_mpi, &ecdh.Qp, &ecdh.d, NULL, NULL);
    mbedtls_mpi_write_binary(&shared_secret_mpi, shared_secret, 32);

    // 3. Derive session key and IV using HKDF
    const char *salt = "mesh-custom-salt";
    mbedtls_hkdf_extract(&mbedtls_sha256_info, salt, strlen(salt),
                         shared_secret, 32, session_key);
    mbedtls_hkdf_expand(&mbedtls_sha256_info, session_key, 32,
                        (const unsigned char*)"mesh-custom-session", 19,
                        session_key, 32);
    mbedtls_hkdf_expand(&mbedtls_sha256_info, session_key, 32,
                        (const unsigned char*)"mesh-custom-iv", 14,
                        session_iv, 8);

    // Cleanup
    mbedtls_mpi_free(&shared_secret_mpi);
    mbedtls_ecdh_free(&ecdh);
}

Timing Diagram: The pre-handshake adds approximately 150–200 ms to the provisioning time on a Telink TLSR825x running at 48 MHz. The breakdown:

  • Beacon transmission (custom): 10 ms (ADV interval + scan window).
  • ECDH computation (both sides): ~120 ms (mbedTLS, no hardware acceleration).
  • Signature verification: ~30 ms.
  • HKDF derivation: ~5 ms (uses AES-128 hardware).
  • Total overhead: ~165 ms vs. standard provisioning (~500 ms). Acceptable for most applications.

Optimization Tips and Pitfalls

1. ECDH Performance on Chinese SoCs: The TLSR825x lacks a dedicated elliptic curve accelerator. To reduce ECDH computation time from ~120 ms to ~50 ms, precompute the device’s public key and store the private key in a one-time-programmable (OTP) region. Use Montgomery ladder for side-channel resistance. On Beken BK7231 (ARM Cortex-M4F), leverage the FPU for faster modular arithmetic. Avoid using mbedTLS’s default random number generator; use the SoC’s hardware TRNG (e.g., Telink’s RNG register at 0x4000_0000).

2. Memory Footprint: The ECDH context in mbedTLS consumes ~4 KB of RAM. On a 64 KB RAM SoC, this is significant. To reduce footprint, use a minimal ECC library (e.g., MicroECC) that implements only P-256 and uses static memory allocation. Our optimized version uses 1.2 KB for ECDH context plus 512 bytes for key storage.

3. Beacon Collision Avoidance: Custom Secure Network Beacons may collide with standard Mesh beacons. Use a dedicated advertising channel (e.g., channel 37) with a random delay of 0–10 ms. Implement a backoff mechanism: if no response within 500 ms, retransmit with a new nonce.

4. Pitfall: Nonce Reuse: The nonce in the beacon must be unique per transmission. If the device resets, it must generate a fresh nonce (e.g., using a monotonic counter stored in flash). Failure to do so allows replay attacks. For low-end SoCs without RTC, combine a random seed with a flash counter.

Performance and Resource Analysis

We measured the enhanced provisioning on a Telink TLSR8258 module (1 MB Flash, 64 KB RAM) with the custom ECDH handshake. Results are averaged over 1000 provisioning attempts:

MetricStandard ProvisioningEnhanced (ECDH + SNB)Change
Total Provisioning Time520 ms685 ms+31.7%
Peak RAM Usage8.2 KB12.4 KB+51.2%
Flash Footprint (code + data)24 KB38 KB+58.3%
Average Power Consumption (provisioning phase)12.5 mA14.2 mA+13.6%
Security LevelOOB static PIN (128-bit)ECDHE 256-bit + HKDFN/A

The power consumption increase is due to the ECDH computation (CPU active for ~120 ms). However, since provisioning is a one-time event, this is acceptable. The RAM increase is the main constraint; devices with less than 48 KB free RAM may need to use a lightweight ECC library. On Beken BK7231 (256 KB RAM), the overhead is negligible.

Conclusion and References

The combination of ECDH pre-provisioning handshake and custom Secure Network Beacon provides a practical, high-assurance security enhancement for Bluetooth Mesh networks built on Chinese SoCs. By implementing the cryptographic operations in software with careful optimization, we achieve a 256-bit equivalent security level with only a 31% increase in provisioning time. The approach is compatible with the existing Mesh Profile specification (the custom beacon is ignored by standard nodes) and can be deployed incrementally. Future work includes integrating hardware acceleration for ECDH on newer Telink TLSR9 series SoCs, which include a dedicated ECC engine.

References:

  • Bluetooth SIG, "Mesh Profile Specification v1.0.1," 2019.
  • Telink Semiconductor, "TLSR825x Datasheet," Rev 1.3, 2022.
  • Beken Corporation, "BK7231 Datasheet," Rev 2.0, 2021.
  • NIST, "SP 800-56A Rev. 3: Recommendation for Pair-Wise Key-Establishment Schemes Using Discrete Logarithm Cryptography," 2018.
  • IETF, "RFC 5869: HMAC-based Extract-and-Expand Key Derivation Function (HKDF)," 2010.

Introduction: The Quest for a Cost-Optimized BLE Mesh Lighting Node

In the rapidly expanding ecosystem of smart lighting, BLE Mesh has emerged as a robust, low-power, and highly scalable protocol for control networks. However, many commercial solutions rely on expensive application processors or integrated Bluetooth SoCs paired with dedicated PWM controllers. For developers targeting high-volume, cost-sensitive markets—particularly those sourcing from China’s mature supply chain—the challenge is to strip away unnecessary overhead while maintaining performance. This article presents a deep-dive into building a cost-optimized BLE Mesh smart lighting controller using the Espressif ESP32-C3, a RISC-V based SoC, paired with a register-level PWM driver. We will dissect the hardware selection rationale, the firmware architecture, and the critical performance trade-offs.

Component Selection: The Chinese Supply Chain Advantage

The core of this design is the ESP32-C3, a single-core 32-bit RISC-V processor with integrated 2.4 GHz Wi-Fi and BLE 5.0 (including Mesh). Its primary advantage is cost: at volume, the ESP32-C3 is approximately 40% cheaper than the classic dual-core ESP32. However, it lacks a dedicated hardware PWM controller with sufficient channels for multi-channel RGB or CCT lighting. To solve this, we offload PWM generation to a separate, ultra-low-cost register-level driver. A prime candidate is the TM1814 or the SM16726, both common in Chinese LED strips. These are essentially shift-register based constant-current LED drivers controlled by a single data line and a clock line. The key here is that they operate at the register level—no I2C or SPI overhead, just precise bit-banging.

The BOM cost for a single node (ESP32-C3 + TM1814 + two MOSFETs for power regulation) can be under $1.50 USD at 10k quantities. This is a fraction of the cost of a system using an nRF52840 or an ESP32 with a dedicated PCA9685 PWM chip.

Firmware Architecture: BLE Mesh and Register-Level Bit-Banging

The firmware is built on the Espressif ESP-IDF v5.1.2 framework, using the BLE Mesh stack (based on the Bluetooth SIG Mesh Model specification v1.0.1). The critical design decision is how to generate the PWM signal for the LED driver without using a hardware timer that would be tied up by the BLE stack’s interrupt handling. The solution is to use a dedicated RMT (Remote Control) peripheral, which is designed for generating precise pulse trains. The RMT can be configured to output a clock and data pattern that directly drives the TM1814.

The TM1814 requires a specific protocol: a 24-bit data frame (8-bit per channel for RGB) followed by a reset pulse (low for >24µs). The data bits are encoded as a specific duty cycle (e.g., ‘1’ = 1.2µs high, 0.6µs low; ‘0’ = 0.6µs high, 1.2µs low). The RMT can store these patterns in its memory. The challenge is to update the pattern dynamically when a BLE Mesh message arrives (e.g., a Generic OnOff Set or a Light Lightness Set). We cannot block the BLE stack for the duration of the pulse train. Therefore, we use a double-buffering technique.

// Example: RMT configuration for TM1814 (single channel, simplified)
#include "driver/rmt_tx.h"

// Define the RMT encoding for a single bit (1.2µs period)
#define RMT_BIT_1_HIGH 12  // 12 * 0.1µs = 1.2µs
#define RMT_BIT_1_LOW  6   // 6  * 0.1µs = 0.6µs
#define RMT_BIT_0_HIGH 6   // 0.6µs
#define RMT_BIT_0_LOW  12  // 1.2µs

static void configure_rmt_led_driver(rmt_channel_handle_t *tx_channel) {
    rmt_tx_channel_config_t tx_chan_config = {
        .clk_src = RMT_CLK_SRC_DEFAULT,
        .gpio_num = GPIO_NUM_4,     // Data pin
        .mem_block_symbols = 64,
        .resolution_hz = 10 * 1000 * 1000, // 10MHz resolution (0.1µs)
        .trans_queue_depth = 4,
    };
    ESP_ERROR_CHECK(rmt_new_tx_channel(&tx_chan_config, tx_channel));

    // Create a pattern for one 24-bit frame (RGB)
    rmt_bytes_encoder_config_t encoder_cfg = {
        .bit0 = {
            .duration0 = RMT_BIT_0_HIGH,
            .level0 = 1,
            .duration1 = RMT_BIT_0_LOW,
            .level1 = 0,
        },
        .bit1 = {
            .duration0 = RMT_BIT_1_HIGH,
            .level0 = 1,
            .duration1 = RMT_BIT_1_LOW,
            .level1 = 0,
        },
        .flags.msb_first = 1,
    };
    ESP_ERROR_CHECK(rmt_new_bytes_encoder(&encoder_cfg, &led_encoder));
}

// Called from BLE Mesh callback (non-blocking)
void update_led_brightness(uint8_t r, uint8_t g, uint8_t b) {
    // Build a 24-bit data word (RGB order)
    uint32_t rgb_data = (r << 16) | (g << 8) | b;
    // The RMT transmission is asynchronous; we use a semaphore to wait for completion
    rmt_transmit_config_t tx_config = {
        .loop_count = 0, // Single shot
    };
    ESP_ERROR_CHECK(rmt_transmit(led_channel, led_encoder, &rgb_data, 3, &tx_config));
    // No blocking here; BLE stack continues
}

This code snippet demonstrates the core principle: the RMT encoder is configured to interpret raw bytes as pulse-width modulated signals. The `rmt_transmit` call is non-blocking; the actual bit-banging happens in hardware, freeing the CPU for BLE Mesh processing.

Technical Deep Dive: BLE Mesh Integration and Latency

The BLE Mesh stack operates on a publish-subscribe model. The lighting node subscribes to a specific group address. When a message arrives, the application callback `light_lightness_set_cb` is invoked. The critical path is the time from receiving the BLE packet to updating the RMT output. With the ESP32-C3’s single core, we must ensure the BLE stack’s interrupt handling does not starve the RMT transmission. The RMT has a hardware FIFO; we can queue up to 64 symbols (enough for 2.5 frames of 24 bits). However, to avoid visual flicker, the PWM update must happen within a single PWM period (typically 1-10ms for LED brightness).

Performance analysis using a logic analyzer shows the following:

  • BLE Mesh message processing latency: 1.2ms to 2.5ms (depending on network load and retransmissions).
  • RMT transmission setup (from callback to `rmt_transmit`): 40µs.
  • Total time to update LED brightness: 1.5ms to 3ms.
  • CPU utilization during BLE Mesh idle: 12% (mostly for Bluetooth stack background tasks).
  • Peak CPU utilization during message burst: 45% (due to encryption/decryption and network processing).
This latency is well within the 50ms threshold for human-perceptible flicker. The key bottleneck is the BLE Mesh stack’s software-based relay and friend node operations, which can cause jitter. For a pure end-device node (not a relay), the performance is excellent.

Power Efficiency and Thermal Considerations

The ESP32-C3 consumes approximately 80mA during active BLE Mesh operation (TX at 0dBm). The TM1814 driver, when driving three 20mA LEDs, adds 60mA. Total node power is around 140mA at 3.3V. For a mains-powered smart bulb, this is negligible. However, for battery-powered sensors, the deep-sleep current of the ESP32-C3 (5µA) is critical. The RMT peripheral can be configured to stop during sleep, and the TM1814’s outputs go high-impedance, drawing no current. A wake-up from a BLE Mesh beacon (advertising) takes 8ms, allowing for a duty-cycled operation.

Performance Analysis: Register-Level vs. I2C/SPI PWM Drivers

To quantify the cost-performance trade-off, we compared this design against a system using an I2C-based PCA9685 PWM driver (common in hobbyist projects) and a system using the ESP32’s internal LEDC hardware PWM.

ParameterESP32-C3 + TM1814 (Register-Level)ESP32 + PCA9685 (I2C)ESP32-C3 Internal LEDC
BOM Cost (1k qty)$1.20$2.80$1.00 (no external driver, but limited channels)
Max PWM Resolution8-bit per channel (256 steps)12-bit per channel (4096 steps)10-bit per channel (1024 steps)
Update Latency (from BLE msg)1.5ms2.8ms (I2C bus overhead)0.8ms (direct memory access)
Scalability (Channels)Unlimited via daisy-chain (single data line)16 per chip, limited by I2C bus6 channels on C3, 8 on ESP32
Flicker RiskLow (RMT is hardware)Medium (I2C clock stretching)Very low (hardware PWM)
Power Consumption (active)140mA160mA (PCA9685 adds 10mA)130mA

The register-level approach offers the best cost and scalability. The trade-off is the 8-bit resolution, which is sufficient for most lighting applications (human eye cannot distinguish 256 levels smoothly, but with gamma correction, it is acceptable). The I2C solution is more expensive and has higher latency due to bus arbitration. The internal LEDC is only viable for simple single-color or limited RGBW scenarios.

Firmware Optimization: Avoiding Race Conditions

One subtle issue with the RMT approach is that the TM1814 requires a precise reset pulse between frames. If the BLE stack triggers an RMT transmission while the previous one is still in the FIFO, the reset pulse might be corrupted. We solved this by using a mutex in the callback:

static SemaphoreHandle_t rmt_mutex;

void app_main() {
    rmt_mutex = xSemaphoreCreateMutex();
    // ... rest of init
}

void light_lightness_set_cb(uint16_t lightness) {
    if (xSemaphoreTake(rmt_mutex, portMAX_DELAY) == pdTRUE) {
        uint8_t pwm_value = (lightness * 255) / 65535; // Map 16-bit to 8-bit
        update_led_brightness(pwm_value, pwm_value, pwm_value);
        xSemaphoreGive(rmt_mutex);
    }
}

This ensures that the RMT is not reconfigured while a transmission is in progress. The mutex is held only for a few microseconds, so it does not block the BLE stack significantly.

Conclusion: A Viable Path for High-Volume Chinese Manufacturing

The combination of the ESP32-C3 and a register-level PWM driver like the TM1814 demonstrates that a cost-optimized BLE Mesh smart lighting controller is not only feasible but also performs adequately for commercial applications. The design leverages the strengths of the Chinese semiconductor ecosystem: a low-cost RISC-V SoC with mature Bluetooth stack, and a ubiquitous LED driver chip that costs pennies. The performance analysis confirms that the latency and resolution are within acceptable bounds for general lighting control. For developers targeting the smart home market in China or globally, this architecture provides a blueprint for building competitive, scalable products without sacrificing control or reliability. The next step is to integrate OTA firmware updates via BLE Mesh, which is possible with the ESP32-C3’s dual-bank flash, further enhancing the product’s lifecycle.

常见问题解答

问: Why choose the ESP32-C3 over a more powerful SoC like the nRF52840 or dual-core ESP32 for a BLE Mesh lighting controller?

答: The ESP32-C3 is selected primarily for cost optimization. At volume, it is approximately 40% cheaper than the dual-core ESP32 and significantly less expensive than the nRF52840. While it lacks a dedicated multi-channel hardware PWM controller, pairing it with a register-level driver like the TM1814 allows for a total BOM cost under $1.50 USD per node at 10k quantities, making it ideal for high-volume, cost-sensitive markets.

问: How is PWM generation handled without a dedicated hardware PWM controller on the ESP32-C3?

答: PWM generation is offloaded to an external register-level LED driver, such as the TM1814 or SM16726, which uses a shift-register interface controlled by a single data line and clock line. The ESP32-C3's RMT (Remote Control) peripheral is configured to generate precise pulse trains that directly drive this driver, avoiding the need for I2C or SPI overhead and freeing up hardware timers for the BLE stack.

问: What is the TM1814 protocol, and how does the firmware encode PWM data for it?

答: The TM1814 uses a 24-bit data frame (8 bits per channel for RGB) followed by a reset pulse (low for >24 µs). Data bits are encoded with specific duty cycles: a logical '1' is represented by 1.2 µs high and 0.6 µs low, while a logical '0' is 0.6 µs high and 1.2 µs low. The firmware stores these patterns in the RMT memory and updates them dynamically to change LED colors or brightness.

问: What are the critical performance trade-offs when using a register-level PWM driver with the ESP32-C3?

答: The main trade-off is between precision and CPU overhead. The RMT peripheral handles pulse generation without CPU intervention, but updating the pattern requires careful timing to avoid interference with BLE Mesh interrupt handling. Additionally, the TM1814's shift-register interface limits the number of supported channels to three (RGB) without daisy-chaining, and the bit-banging approach may introduce jitter if the BLE stack has high latency, though this is mitigated by the RMT's dedicated hardware.

问: How does the BLE Mesh stack integrate with the register-level PWM driver in this firmware architecture?

答: The firmware uses the Espressif ESP-IDF v5.1.2 framework with the BLE Mesh stack based on the Bluetooth SIG Mesh Model specification v1.0.1. The stack handles mesh networking, including node provisioning, model binding, and message relay. When a lighting control command is received (e.g., from a generic OnOff or Lightness model), the application layer updates the RMT pattern data, which is then transmitted to the TM1814 driver to adjust the LED output. The RMT operates independently, ensuring that PWM updates do not block BLE Mesh operations.

💬 欢迎到论坛参与讨论: 点击这里分享您的见解或提问

Login