
Amila Senadheera

Tech enthusiast

Sharding MySQL seamlessly using Vitess


Published on December 15, 2023

Initially, Kubernetes focused on stateless workloads, but its evolution with features like StatefulSets, CSI support, and the operator pattern now enables the management of stateful workloads. Vitess, a notable solution for scaling MySQL databases, played a pivotal role in this shift.

Developed by YouTube's SRE team and now part of the Cloud Native Computing Foundation, Vitess helps Kubernetes effectively manage and scale MySQL databases. It offers unique capabilities for handling partitioned databases, surpassing the constraints of some cloud-provider database solutions like AWS Aurora. Two notable success stories, GitHub and Activision, have embraced the power of Vitess. Watch GitHub's story and Activision's story to gain insights into their experiences.

While AWS Aurora is an option for partitioned databases, it comes with limitations, such as requiring the application to handle partitioning logic and involving multiple manual steps for resharding workflows. Vitess extends beyond these constraints, offering sophisticated partitioning and scaling capabilities for cloud-native environments running on Kubernetes.

In this guide, we'll explore scaling your MySQL database using Vitess, focusing on a "Car Booking Service" as an example. See the complete code in the companion repo.

Deciding the tables

At the core, we'll need two primary tables: car and booking. To handle booking transactions safely, we'll also incorporate a third table. To ensure compatibility with Vitess, we'll operate under the REPEATABLE READ isolation level, and we'll implement materialized locking for bookings: locking rows within the car_time_slot_lock table before placing the actual booking in the booking table. This knowledge is not essential to follow this blog; if the concept seems unfamiliar, I recommend the "Transactions" chapter of "Designing Data-Intensive Applications" by Martin Kleppmann for a more comprehensive understanding.

Below are the initial table structures. Assume that user-related information is sourced from an authentication service via user sessions / user tokens:

[Diagram: the unsharded keyspace containing the car, booking, and car_time_slot_lock tables, with sample rows for cars car-01 and car-02, bookings book-01 and book-02 by users usr-x and usr-y, and locked time slots t1-t2 and t3-t4.]

Installing Vitess

Follow the Vitess Operator for Kubernetes guide alongside the VitessCluster manifest, SQL tables, and VSchema given below.
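For reference, the guide boils down to cloning the Vitess repository and applying the operator manifest from its examples directory (the paths below follow the upstream examples; adjust them to your setup):

git clone https://github.com/vitessio/vitess
cd vitess/examples/operator

# Install the Vitess operator into the current namespace
kubectl apply -f operator.yaml

# Wait for the operator pod to reach the Running state
kubectl get pods -w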

The following VitessCluster manifest will spin up two vttablets (Vitess-managed MySQL instances), one acting as the primary and the other as a standby replica. Optionally, you can include read-only tablets as well. Note that we have a single partition with two replicas.

VitessCluster manifest:

apiVersion: planetscale.com/v2
kind: VitessCluster
metadata:
  name: car-booking-service
spec:
 ...
 ...

  keyspaces:
  - name: car-booking-service
    ...
    partitionings:
    - equal:
        parts: 1
        shardTemplate:
          ...
          ...
          tabletPools:
          - cell: zone1
            type: replica
            replicas: 2
            vttablet:
            ...
            ...
  updateStrategy:
    type: Immediate
...
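Save the manifest to a file and apply it; the file name here is only an example:

# Create the cluster; the operator will spin up etcd, vtctld, vtgate,
# and the vttablet pods defined by the manifest
kubectl apply -f car_booking_initial_cluster.yaml

# Watch the components come up
kubectl get pods -w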

SQL tables:

CREATE TABLE car (
    car_id CHAR(36) PRIMARY KEY,
    make VARCHAR(255) NOT NULL,
    model VARCHAR(255) NOT NULL,
    year INT NOT NULL,
    mileage INT,
    price DECIMAL(10, 2) NOT NULL,
    description TEXT,
    image_url VARCHAR(255)
);

CREATE TABLE car_time_slot_lock (
    car_id CHAR(36) NOT NULL,
    start_time DATETIME NOT NULL,
    end_time DATETIME NOT NULL,
    PRIMARY KEY (car_id, start_time, end_time),
    FOREIGN KEY (car_id) REFERENCES car(car_id)
);

CREATE TABLE booking (
    id CHAR(36) PRIMARY KEY,
    user_id CHAR(36) NOT NULL,
    car_id CHAR(36) NOT NULL,
    start_time DATETIME NOT NULL,
    end_time DATETIME NOT NULL,
    created_at DATETIME NOT NULL,
    cancelled_at DATETIME,
    FOREIGN KEY (car_id) REFERENCES car(car_id)
);

VSchema:

{
  "sharded": false,
  "tables": {
    "car": {},
    "booking": {},
    "car_time_slot_lock": {}
  }
}
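With the cluster up, the tables and VSchema can be applied through vtctlclient; the file names below are illustrative:

# Create the tables in the car-booking-service keyspace
vtctlclient ApplySchema -- --sql-file create_tables.sql car-booking-service

# Register the unsharded VSchema for the keyspace
vtctlclient ApplyVSchema -- --vschema_file vschema_unsharded.json car-booking-service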

Main database queries in this application

  • Perform a JOIN operation between the car and booking tables to fulfill user search requests, displaying already booked time slots for the cars:
SELECT filtered_car.car_id, filtered_car.make, filtered_car.model, 
  filtered_car.year, filtered_car.mileage, filtered_car.price, 
  JSON_ARRAYAGG(JSON_OBJECT('start', booking.start_time, 'end', booking.end_time)) AS booked_times 
    FROM (
      SELECT * FROM car
        WHERE 
          make = <make> AND
          model = <model> AND 
          year BETWEEN <year_start> AND <year_end> AND
          mileage BETWEEN <mileage_start> AND <mileage_end> AND 
          price BETWEEN <price_start> AND <price_end>
    ) AS filtered_car LEFT JOIN booking 
        ON filtered_car.car_id = booking.car_id 
      GROUP BY filtered_car.car_id
  LIMIT <limit> OFFSET <offset>;

  • Place a booking for a car by first locking rows within the car_time_slot_lock table. Then, proceed to INSERT rows into the booking table:
BEGIN TRANSACTION;

# The FOR UPDATE clause locks all the consecutive time slot rows 
# (that we are going to book the car for) for the given car_id
SELECT * FROM car_time_slot_lock 
  WHERE car_id = <car_id> 
    AND start_time >= <booking_start_time> 
    AND end_time <= <booking_end_time> FOR UPDATE;

# Check whether there are overlapping time slots for the same car
SELECT * FROM booking 
  WHERE car_id = <car_id> 
    AND end_time > <booking_start_time> 
    AND start_time < <booking_end_time>;

# If the above query returns no rows, complete the booking by inserting into the booking table
INSERT INTO booking 
  (id, car_id, user_id, 
    start_time, end_time, created_at) 
  VALUES 
  (<uuid>, <car_id>, <user_id>, 
    <booking_start_time>, <booking_end_time>, <created_at_timestamp>);

COMMIT;

Why do we need sharding?

We began with a single database housing the tables above. As our user base has grown and our fleet of cars has expanded with the company, the frequency of car bookings has increased notably, especially during peak seasons. To meet the higher demand, we want to keep the search experience fast and scale transaction processing to handle the increased load. To tackle these bottlenecks, we're considering sharding the database.

In this scenario, the primary approach is to partition the existing data across a new set of databases and then reroute live traffic from the unpartitioned database to the partitioned ones. Vitess streamlines this entire process. Notably, it offers a feature called VReplication, which creates the new tables in fresh databases and populates them in near real time. This lets you replicate live data and assess the performance of the new queries before cutting over.

Decision making for sharding

  • To efficiently manage the car and booking data across shards, we've opted for hash-based partitioning on car_id for both the car and booking tables. Queries that filter by car_id execute within a single shard, while the remaining queries use a scatter-gather approach across all shards, so query capacity scales roughly with the shard count: about double with two shards and potentially fivefold with five.
  • Furthermore, to enhance transaction processing, we've extended the sharding strategy to the car_time_slot_lock table, again using car_id as the sharding key. This ensures each booking transaction occurs entirely within a single shard, effectively doubling our transaction-processing capacity by letting the two shards process bookings independently.

The beauty of this approach is that the application doesn't need to manage the scatter-gather logic. VTGate handles this complexity transparently, allowing the application to operate as if there were only a single database.
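To make the routing behavior concrete, here is a sketch of how VTGate treats two of our queries once the keyspace is sharded (using the same placeholder convention as above):

# Single-shard: the predicate on car_id lets VTGate hash the value with
# the vindex and route the query to exactly one shard
SELECT * FROM booking WHERE car_id = <car_id>;

# Scatter-gather: no car_id predicate, so VTGate fans the query out to
# all shards and merges the result sets
SELECT * FROM car WHERE make = <make> AND model = <model>;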

To let Vitess understand the new layout, we need to apply a new VSchema outlining how the tables are sharded. We use unicode_loose_md5 as the hashing vindex, since the key columns are UUID strings.

{
  "sharded": true,
  "vindexes": {
    "uuid_hash": {
      "type": "unicode_loose_md5"
    }
  },
  "tables": {
    "car": {
      "column_vindexes": [
        {
          "column": "id",
          "name": "uuid_hash"
        }
      ]
    },
    "booking": {
      "column_vindexes": [
        {
          "column": "car_id",
          "name": "uuid_hash"
        }
      ]
    },
    "car_time_slot_lock": {
      "column_vindexes": [
        {
          "column": "car_id",
          "name": "uuid_hash"
        }
      ]
    }
  }
}
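As before, this can be applied with vtctlclient (file name illustrative):

vtctlclient ApplyVSchema -- --vschema_file vschema_sharded.json car-booking-service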

Apply the new VitessCluster manifest:

apiVersion: planetscale.com/v2
kind: VitessCluster
metadata:
  name: car-booking-service
spec:
 ...
 ...

  keyspaces:
  - name: car-booking-service
    ...
    partitionings:
    - equal:
        parts: 1
        shardTemplate:
          ...
          ...
          tabletPools:
          - cell: zone1
            type: replica
            replicas: 2
            vttablet:
            ...
            ...
    - equal:
        parts: 2
        shardTemplate:
          ...
          ...
          tabletPools:
          - cell: zone1
            type: replica
            replicas: 2
            vttablet:
            ...
            ...
  updateStrategy:
    type: Immediate

To proceed, connect to each new shard using a SQL client as outlined below, then apply the same table definitions provided earlier.
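If you don't already have a client session open, one way in (service names vary per cluster, so look yours up first) is to port-forward vtgate's MySQL port:

# Find the vtgate service created by the operator
kubectl get service --selector="planetscale.com/component=vtgate"

# Forward its MySQL port to localhost, substituting the name found above
kubectl port-forward svc/<vtgate-svc> 15306:3306 &

# Connect through vtgate; credentials depend on your cluster's secret
mysql -h 127.0.0.1 -P 15306 -u user -p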

-80 means the shard contains rows where the hashed shard key falls in the range from the minimum possible value to the midpoint (the lower half of the keyspace):

mysql> use `car-booking-service:-80`;

80- means the shard contains rows where the hashed shard key falls in the range from the midpoint to the maximum possible value (the upper half of the keyspace):

mysql> use `car-booking-service:80-`;

For the car-booking-service keyspace, we now have three underlying databases (or six if replicas are counted). One operates as the unsharded database serving ongoing queries; the other two have just been created. Our next step is to initiate a resharding workflow:

Resharding

vtctlclient Reshard -- --source_shards '-' --target_shards '-80,80-' Create car-booking-service.car2car
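While the copy phase runs, you can monitor it and verify row-level consistency between the source and target shards; confirm the exact syntax against your Vitess version:

# Show how far along the copy phase is
vtctlclient Reshard -- Progress car-booking-service.car2car

# Compare source and target data row by row
vtctlclient VDiff -- car-booking-service.car2car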

[Diagram: after resharding, shards -80 and 80- each hold their own car, booking, and car_time_slot_lock tables; car-01, its booking book-02 (usr-y, t1-t2), and its time slot lock live on one shard, while car-02, its booking book-01 (usr-x, t3-t4), and its time slot lock live on the other.]

Cut-over

First, we transition the read-only and replica instances, followed by the primary instances:

vtctlclient Reshard -- --tablet_types=rdonly,replica SwitchTraffic car-booking-service.car2car
vtctlclient Reshard -- --tablet_types=primary SwitchTraffic car-booking-service.car2car
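If anything misbehaves after the switch, traffic can be routed back to the original shard at any point before the workflow is completed:

# Roll back: send traffic to the source shard again
vtctlclient Reshard -- ReverseTraffic car-booking-service.car2car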

Clean up the workflow

vtctlclient Reshard -- Complete car-booking-service.car2car

Take down shard '-' (the unsharded one)

Apply the new VitessCluster manifest:

apiVersion: planetscale.com/v2
kind: VitessCluster
metadata:
  name: car-booking-service
spec:
 ...
 ...

  keyspaces:
  - name: car-booking-service
    ...
    partitionings:
    - equal:
        parts: 2
        shardTemplate:
          ...
          ...
          tabletPools:
          - cell: zone1
            type: replica
            replicas: 2
            vttablet:
            ...
            ...
  updateStrategy:
    type: Immediate

Now equipped with a database spanning two shards, we can handle queries more efficiently and with significantly better performance. I hope you've gained valuable insights!

Thank you for reading & Happy Holidays!

If you like it, share it!

