The ESP32 Family is Not a Single Chip
When engineers and hobbyists say “ESP32” they often mean the original dual-core chip released in 2016. In reality, Espressif has since built a whole family of distinct silicon variants under the ESP32 brand, each optimised for a different balance of performance, connectivity, power, and price; this guide looks at six of them. Understanding the differences saves you from buying a chip that lacks a feature you need, or overpaying for capabilities your project will never use.
This guide covers: the original ESP32, the ESP32-S2, ESP32-S3, ESP32-C3, ESP32-C6, ESP32-H2, and the popular ESP32-CAM module. By the end you will know which variant belongs in which type of project.
Original ESP32 (2016)
The original chip pairs two Xtensa LX6 cores at up to 240 MHz with 520 KB SRAM and support for up to 16 MB of external SPI flash. It combines 2.4 GHz Wi-Fi 802.11 b/g/n with Classic Bluetooth 4.2 and Bluetooth Low Energy in the same radio. The 34 GPIO pins include two 8-bit DAC outputs, 18 ADC channels across two units, 10 capacitive touch sensors, an I²S audio interface, and an ultra-low-power (ULP) co-processor for sensor monitoring during deep sleep.
The most common module is the ESP32-WROOM-32 — a metal-shielded rectangular module 18 mm × 25.5 mm housing the chip, 4 MB flash, and a PCB trace antenna. The ESP32-WROVER variant adds 4 MB or 8 MB of SPI PSRAM for memory-intensive applications like JPEG decoding and web serving. Development boards like the DevKitC and NodeMCU-32S mount these modules on a PCB with USB programming circuitry.
The original ESP32 is the best choice when you need Classic Bluetooth (A2DP audio, SPP serial), require the most mature Arduino library ecosystem, or are porting an existing project with well-tested ESP32 firmware. Its main weaknesses relative to newer variants are the lack of native USB and slightly higher idle power draw.
ESP32-S2 (2019)
The S2 replaced the two Xtensa LX6 cores with a single, newer Xtensa LX7 core at 240 MHz and removed Bluetooth entirely. The rationale: many USB-connected devices (HID keyboards, MIDI controllers, CDC serial adapters) need USB but not Bluetooth. Adding native USB OTG (Full-Speed, 12 Mbps) directly to the chip eliminates the external CH340 or CP2102 USB-to-serial chip that development boards previously required, lowering BOM cost and letting user firmware enumerate the ESP32-S2 as a composite USB device.
The S2 expanded GPIO to 43 pins and added a 14-channel capacitive touch controller. It introduced a hardware security engine (RSA-3072, AES-256, SHA-2) and a digital signature peripheral for secure provisioning, features aimed at product developers shipping connected consumer goods.
Use the ESP32-S2 when your project connects via USB to a host computer (keyboards, MIDI, custom HID devices), needs a large number of GPIO, or requires hardware cryptography. Avoid it when you need Bluetooth — the S2 has none.
ESP32-S3 (2021)
The S3 is the current flagship for performance-sensitive applications. It pairs two Xtensa LX7 cores at 240 MHz (the same generation as the S2, but dual-core), adds vector processing instructions for AI/ML inference acceleration, keeps the 2.4 GHz Wi-Fi and native USB OTG of the S2, supports Bluetooth Low Energy 5.0, and exposes 45 GPIO pins. Internal SRAM is 512 KB; modules commonly add up to 8 MB of octal-SPI PSRAM (a faster interface than the quad-SPI PSRAM in WROVER) alongside 8 MB or more of SPI flash.
The vector extensions let the S3 run TensorFlow Lite Micro models and audio DSP algorithms significantly faster than the original ESP32; Espressif’s own benchmarks show a 5× to 10× improvement on convolution-heavy inference tasks. This makes the S3 the chip of choice for edge AI: wake-word detection, image classification on camera frames, gesture recognition, and similar on-device inference workloads. Many ESP32-S3 boards also break out the DVP camera interface directly, which makes camera-plus-AI builds more straightforward than on the ESP32-CAM.
Use the ESP32-S3 for any new project that could benefit from AI inference, large RAM buffers, native USB, BLE 5.0, or simply the most modern ESP32 architecture available. The increased GPIO count also makes it attractive for multiplexed matrix keyboards and large LED arrays.
ESP32-C3 (2020)
The C3 was Espressif’s first RISC-V chip, a strategic move away from the licensed Xtensa architecture toward an open-source ISA. The single 32-bit RISC-V core runs at up to 160 MHz, making it slower than the dual-core ESP32 on raw throughput but extremely competitive on cost. Wi-Fi 802.11 b/g/n and BLE 5.0 are included; Classic Bluetooth is not. GPIO count is 22, with 6 ADC channels and 6 LED PWM channels.
The C3’s primary advantage is price. In volume, ESP32-C3 modules cost roughly 30–40% less than original ESP32 modules, making it attractive for any connected device where the project does not need two cores, analog audio, or the expanded GPIO of the S-series. It also has native USB Serial/JTAG — not full USB OTG, but enough to program it without an external USB-to-serial chip, which simplifies hardware design.
Choose the ESP32-C3 for simple sensor nodes, smart plugs, light controllers, and any battery-powered BLE beacon where you want Wi-Fi provisioning but the lowest possible module cost. The RISC-V architecture does not affect Arduino sketch development — the ESP32 Arduino core supports C3 identically at the API level.
ESP32-C6 (2022)
The C6 upgrades the RISC-V design to a two-core arrangement (a high-performance core at up to 160 MHz plus a low-power core at up to 20 MHz), adds Wi-Fi 6 (802.11ax) with OFDMA and Target Wake Time for significantly improved power efficiency in dense Wi-Fi environments, includes BLE 5.0, and, most importantly, adds an IEEE 802.15.4 radio for Thread and Zigbee mesh networking. This is the first ESP32 variant to combine Wi-Fi, BLE, and a Thread/Zigbee radio on a single chip.
For smart home developers building Matter-certified devices, the C6 is compelling because Matter over Thread requires exactly this combination of radios. The chip can act as a Thread router, a Zigbee coordinator, or a BLE gateway while simultaneously connected to Wi-Fi. Espressif ships a full Matter SDK (esp-matter) targeting the C6 and H2.
Use the C6 when you are targeting Matter, Thread, or Zigbee connectivity, building smart home devices that need to coexist in a crowded 2.4 GHz environment (Wi-Fi 6’s OFDMA lets an access point serve several devices within one transmission, which helps), or developing products for ecosystems like Apple HomeKit or Google Home that are adopting Matter.
ESP32-H2 (2022)
The H2 is the specialist in the family: it has a single RISC-V core at 96 MHz, IEEE 802.15.4 (Thread/Zigbee), BLE 5.3, and no Wi-Fi whatsoever. Removing Wi-Fi dramatically cuts power draw and silicon area. The H2 is designed for battery-powered mesh sensor nodes that relay data through a Thread network to a border router with internet access, rather than connecting directly to Wi-Fi. Think door/window sensors, occupancy detectors, and climate sensors that report through a hub.
On its own the H2 cannot talk to the internet or your home router. It is always paired with a border router device (a hub running Home Assistant with a Thread-capable radio dongle, for example). For beginners this makes it a more complex starting point. If you are building a Thread/Zigbee mesh from scratch, start with the C6 (which also has Wi-Fi for provisioning) and add H2 nodes as pure mesh endpoints once the infrastructure is established.
ESP32-CAM
The ESP32-CAM is not a distinct chip variant. It is a development module by AI-Thinker (and clones) that mounts an original ESP32 chip alongside a 2 MP OV2640 camera sensor, 4 MB of SPI PSRAM, an SD card slot, a small status LED, and a flash LED. It is one of the most cost-effective camera-enabled microcontroller platforms available, often selling for $5–8.
The tradeoff for its low price is inconvenience: the board has no onboard USB-to-serial chip, so programming requires an external FTDI adapter connected to pins U0T, U0R, GND, and 5V, with GPIO 0 grounded during upload. The camera interface alone occupies GPIO 0, 5, 18, 19, 21, 22, 23, 25, 26, 27, 32, 34, 35, 36, and 39 on the AI-Thinker board, and the SD card slot and LEDs claim more (GPIO 2, 4, and 33 among them), leaving only GPIO 1, 3, and 16 for user peripherals during camera operation. Despite these constraints, the ESP32-CAM is widely used for face recognition, motion detection streaming, QR code scanning, and simple surveillance applications.
If you need a camera without severe GPIO constraints, consider the newer ESP32-S3 based boards that mount a camera while still exposing more user GPIO through the expanded 45-pin array.
Variant Comparison Table
| Variant | Core | Wi-Fi | Bluetooth | 802.15.4 | Native USB | GPIO |
|---|---|---|---|---|---|---|
| ESP32 | 2× LX6, 240 MHz | b/g/n | Classic 4.2 + BLE | No | No | 34 |
| ESP32-S2 | 1× LX7, 240 MHz | b/g/n | No | No | OTG | 43 |
| ESP32-S3 | 2× LX7, 240 MHz | b/g/n | BLE 5.0 | No | OTG | 45 |
| ESP32-C3 | 1× RV32, 160 MHz | b/g/n | BLE 5.0 | No | Serial/JTAG | 22 |
| ESP32-C6 | 1× RV32, 160 MHz | ax (Wi-Fi 6) | BLE 5.0 | Yes | Serial/JTAG | 30 |
| ESP32-H2 | 1× RV32, 96 MHz | No | BLE 5.3 | Yes | Serial/JTAG | 19 |
| ESP32-CAM (module) | ESP32 + camera | b/g/n | Classic 4.2 + BLE | No | No | 3 free |
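For readers who prefer to filter the table programmatically, the rows for the six chips can be written down as data. The following is only a minimal sketch: the names `Variant`, `VARIANTS`, and `find_variants` are hypothetical, the values are copied from the table above, and the ESP32-CAM row is left out because it is a module built on the original ESP32 rather than a chip of its own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    name: str
    core: str          # core count, architecture and clock, as in the table
    wifi: str          # "b/g/n", "ax", or "" when the chip has no Wi-Fi
    classic_bt: bool   # Classic Bluetooth (A2DP, SPP)
    ble: bool          # Bluetooth Low Energy
    ieee802154: bool   # Thread/Zigbee radio
    usb: str           # "otg", "serial", or "" when there is no native USB
    gpio: int


# Values transcribed from the comparison table; nothing here is measured.
VARIANTS = [
    Variant("ESP32",    "2xLX6 240MHz",  "b/g/n", True,  True,  False, "",       34),
    Variant("ESP32-S2", "1xLX7 240MHz",  "b/g/n", False, False, False, "otg",    43),
    Variant("ESP32-S3", "2xLX7 240MHz",  "b/g/n", False, True,  False, "otg",    45),
    Variant("ESP32-C3", "1xRV32 160MHz", "b/g/n", False, True,  False, "serial", 22),
    Variant("ESP32-C6", "1xRV32 160MHz", "ax",    False, True,  True,  "serial", 30),
    Variant("ESP32-H2", "1xRV32 96MHz",  "",      False, True,  True,  "serial", 19),
]


def find_variants(*, wifi=None, classic_bt=None, ble=None,
                  ieee802154=None, usb_otg=None, min_gpio=0):
    """Return the names of variants that satisfy every stated requirement.

    A requirement left as None means "don't care".
    """
    checks = [
        (wifi, lambda v: bool(v.wifi)),
        (classic_bt, lambda v: v.classic_bt),
        (ble, lambda v: v.ble),
        (ieee802154, lambda v: v.ieee802154),
        (usb_otg, lambda v: v.usb == "otg"),
    ]
    return [
        v.name
        for v in VARIANTS
        if v.gpio >= min_gpio
        and all(want is None or get(v) == want for want, get in checks)
    ]


print(find_variants(classic_bt=True))              # ['ESP32']
print(find_variants(wifi=True, ieee802154=True))   # ['ESP32-C6']
print(find_variants(usb_otg=True, min_gpio=40))    # ['ESP32-S2', 'ESP32-S3']
```

Keyword-only arguments keep each call readable as a list of requirements, and an empty result simply means that no chip in this guide offers that combination (Classic Bluetooth together with native USB OTG, for instance).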
Which Variant Should a Beginner Choose?
If you are just starting with ESP32 and want the widest tutorial support, the most libraries, and the smoothest on-ramp: start with the original ESP32 DevKitC. It has been the community standard board for years, and most tutorials you find online will either be written for it specifically or translate directly. Once you understand the fundamentals (GPIO, Wi-Fi, deep sleep) you will have the context to choose a more specialised variant for your next project with confidence.
For new products or projects where you know you need USB, BLE 5, or AI inference: jump straight to the ESP32-S3. For cost-optimised connected sensors: ESP32-C3. For Matter and smart home mesh: ESP32-C6. For a camera on a budget: ESP32-CAM.
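The recommendations in this section can be summarised as a short rule list. The sketch below is one possible encoding, not a definitive selector: the function name `recommend` and its flags are hypothetical, and the order in which the rules are tested (camera first, then mesh radio, then Classic Bluetooth, and so on) is an assumption made here so that the first matching rule wins.

```python
def recommend(*, camera_on_budget=False, matter_or_mesh=False,
              classic_bluetooth=False, usb=False, ble5=False,
              ai_inference=False, lowest_cost=False):
    """Map project needs to the variant this guide suggests.

    Rules are checked top to bottom and the first match is returned.
    The ordering is an assumption, not something the guide prescribes.
    """
    if camera_on_budget:
        return "ESP32-CAM"        # cheapest camera-equipped option
    if matter_or_mesh:
        return "ESP32-C6"         # Wi-Fi + BLE + 802.15.4 on one chip
    if classic_bluetooth:
        return "ESP32"            # only variant here with Classic Bluetooth
    if usb or ble5 or ai_inference:
        return "ESP32-S3"         # native USB, BLE 5, vector instructions
    if lowest_cost:
        return "ESP32-C3"         # cost-optimised connected sensors
    return "ESP32 DevKitC"        # default beginner recommendation


print(recommend())                       # ESP32 DevKitC
print(recommend(ai_inference=True))      # ESP32-S3
print(recommend(lowest_cost=True))       # ESP32-C3
print(recommend(matter_or_mesh=True))    # ESP32-C6
```

A flat chain of `if` statements is enough for this purpose; a project with conflicting needs should be checked against the comparison table instead of relying on whichever rule happens to come first.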