3-Axis Data Collection Server

The questions below are due on Monday March 10, 2025; 05:00:00 PM.
 

Overview

In this assignment you will build a small web server using FastAPI that receives, stores, and serves 3-axis sensor data (think accelerometer: X, Y, Z). Your server will:

  1. Accept incoming data over HTTP/S (with a JSON body) at one endpoint
  2. Store it in a SQLite database
  3. Report on available datasets from a second endpoint
  4. Allow users to list and download stored datasets in CSV format from a third endpoint

In the next part, you will deploy this server alongside your ESP32-C3 data collection system and run it for at least ten collection sequences (details in the next assignment). You will submit both your code and your resulting database. The ultimate goal is to have a decent starting point for easy, mass data collection — useful for analyzing sensors and systems in deployment and for building your team projects.


Background

We have already talked about FastAPI, but refer back to Exercise 1 in case you need a refresher.

Required Imports

You will need the same libraries we used in the Week 1 assignment. Importing json and csv will also be useful. When I wrote up my version I also used pydantic, as well as some newer portions of the fastapi ecosystem. This can obviously vary for you as needed.


Endpoints

Your server must implement three endpoints:

1. POST /upload

This endpoint accepts a JSON body containing a sampling period and three series of floating-point data. When it receives valid data, it should:

  • Generate a timestamp for when the data was received
  • Store the data in the SQLite database
  • Return a confirmation response

Think about: What should happen if the JSON body is missing a field? Pydantic's BaseModel can help you validate the incoming data automatically — you will want to define a model class that describes the expected shape.
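One way to sketch that validation, assuming pydantic is installed (the names UploadPayload and check_series are illustrative, not required): the BaseModel handles types and missing fields automatically, and a small helper enforces the three-equal-length-axes rule.

```python
from pydantic import BaseModel


class UploadPayload(BaseModel):
    sampling_period: int        # ms between samples
    series: list[list[float]]   # expected: exactly 3 axes (X, Y, Z)


def check_series(series: list[list[float]]) -> None:
    """Reject payloads that do not contain three equal-length axes."""
    if len(series) != 3:
        raise ValueError("series must contain exactly 3 axes")
    if len({len(axis) for axis in series}) != 1:
        raise ValueError("all axes must have the same length")
```

In the endpoint itself, a failed check can be turned into an error response (FastAPI already returns a 422 automatically when the BaseModel validation fails).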

2. GET /datasets

This endpoint takes no parameters and returns a JSON object listing all datasets currently in the database. Each entry should include at least the timestamp and sampling period so the user knows what is available for download. The timestamp can then be used to query the third endpoint.

Think about: What SQL query gives you all rows?
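A sketch of the query side using only the standard library (list_datasets is my name for it, and samples_per_axis is recovered by deserializing one axis — it is not stored directly):

```python
import json
import sqlite3


def list_datasets(conn: sqlite3.Connection) -> dict:
    """Return metadata for every stored dataset, newest first."""
    conn.row_factory = sqlite3.Row
    rows = conn.execute(
        "SELECT timestamp, sampling_period, axis_x FROM datasets "
        "ORDER BY timestamp DESC"
    ).fetchall()
    return {
        "datasets": [
            {
                "timestamp": r["timestamp"],
                "sampling_period": r["sampling_period"],
                # one axis is enough to count samples per axis
                "samples_per_axis": len(json.loads(r["axis_x"])),
            }
            for r in rows
        ]
    }
```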

3. GET /download/{timestamp}

This endpoint takes a timestamp as a path parameter and returns the corresponding dataset as a downloadable CSV file.

Think about: What happens if the user requests a timestamp that does not exist? What HTTP status code is appropriate?
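One way to handle the missing case, sketched with the standard library only (fetch_dataset is an illustrative name, and the table matches the schema in the Database section below): return None when no row matches, then convert that into a 404 in the endpoint.

```python
import sqlite3


def fetch_dataset(conn: sqlite3.Connection, timestamp: str):
    """Return the matching row, or None if the timestamp is unknown."""
    conn.row_factory = sqlite3.Row
    return conn.execute(
        # Parameterized query: never interpolate user input into SQL
        "SELECT * FROM datasets WHERE timestamp = ?", (timestamp,)
    ).fetchone()


# In the FastAPI endpoint, a None result would become something like:
#     raise HTTPException(status_code=404, detail="dataset not found")
```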


Database

Use SQLite with the following schema:

conn.executescript("""
    CREATE TABLE IF NOT EXISTS datasets (
        id              INTEGER PRIMARY KEY AUTOINCREMENT,
        timestamp       TEXT    NOT NULL,
        sampling_period INTEGER NOT NULL,
        axis_x          TEXT    NOT NULL,
        axis_y          TEXT    NOT NULL,
        axis_z          TEXT    NOT NULL
    );
""")

Each entry has its primary key, a timestamp (which we will use to refer to and access a dataset), a sampling period (the space between samples in milliseconds), and three text fields to store the individual axes of data.

Note that each axis is stored as TEXT. Think about what format you should use to serialize a Python list of floats into a text column, and how to deserialize it back. The json module is your BFF here! This might seem a little silly, but honestly at the scale we are working with the performance hit will not be too bad, and we do not really need to be performing SQLite queries on the actual data — just the data about the data (the metadata).
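The round-trip for one axis looks like this (the values are just sample numbers):

```python
import json

axis = [-8.5974, 3.531, 6.2744]

stored = json.dumps(axis)      # string form, goes into the TEXT column
restored = json.loads(stored)  # back to a Python list of floats

assert restored == axis
```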


JSON Input Format

The POST body targeting /upload should look like this:

{
  "sampling_period": 50,
  "series": [
    [-8.5974, 3.5310, 6.2744, ...],
    [-1.2345, 1.1030, 7.9810, ...],
    [0.9769, 0.1198, 0.0102, ...]
  ]
}

Where:

  • sampling_period is an integer representing the time between samples in milliseconds
  • series is a list of exactly 3 lists of floats, representing the X, Y, and Z axes respectively. Their lengths should be identical.

For example, at 50 ms sampling over 30 seconds, each axis would contain 600 samples — but your validation should not hardcode this number, since different runs might have different durations.

The timestamp should be generated server-side, not sent by the client. I used something like datetime.datetime.now().strftime("%Y_%m_%d_%H_%M_%S") so the timestamps are easy to put into a URL, but you do you — just make sure it is easy to use and consistent.
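As a sketch (make_timestamp is my name for it, not a requirement):

```python
import datetime


def make_timestamp() -> str:
    # URL-safe: digits and underscores only; zero-padded fields mean
    # string sort order matches chronological order
    return datetime.datetime.now().strftime("%Y_%m_%d_%H_%M_%S")
```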


What Do I Have?

The /datasets endpoint should only accept GET requests with no query arguments (or ignore query arguments) and return a simple JSON object that lists all the current datasets collected. For example, mine yielded the following when it had four datasets in the database:

{
  "datasets": [
    {"timestamp": "2026_03_03_08_16_36", "sampling_period": 50, "samples_per_axis": 100},
    {"timestamp": "2026_03_03_08_16_22", "sampling_period": 50, "samples_per_axis": 100},
    {"timestamp": "2026_03_02_14_30_21", "sampling_period": 50, "samples_per_axis": 100},
    {"timestamp": "2026_03_02_14_30_14", "sampling_period": 100, "samples_per_axis": 200}
  ]
}

The format is not too particular, tbh — we just want something that can be queried to show what is available for download in a parsable way. Once a valid dataset is identified, the timestamp can be used to query the third endpoint for a download.


CSV Output

When a user hits the download endpoint (/download/{timestamp}), you should return a CSV file with the following columns:

sample_index  time(ms)  x        y        z
0             0         -8.5974  -1.2345  0.9769
1             50        3.5310   1.1030   0.1198
2             100       6.2744   7.9810   0.0102
...           ...       ...      ...      ...

The time(ms) column is computed as sample_index * sampling_period.

Here is the CSV construction and response code — use this in your download endpoint once you have retrieved the row from the database (note: not all my code is included here, just enough so you get the idea):

# Assuming row contains the data from a valid SQLite grab.
# (Requires: import io, csv, json — and StreamingResponse from fastapi.responses)
axis_x = json.loads(row["axis_x"])
axis_y = json.loads(row["axis_y"])
axis_z = json.loads(row["axis_z"])
sampling_period = row["sampling_period"]

# Build CSV in memory
output = io.StringIO()
writer = csv.writer(output)
writer.writerow(["sample_index", "time(ms)", "x", "y", "z"])

for i in range(len(axis_x)):
    writer.writerow([i, i * sampling_period, axis_x[i], axis_y[i], axis_z[i]])

output.seek(0)

return StreamingResponse(
    iter([output.getvalue()]),
    media_type="text/csv",
    headers={"Content-Disposition": f"attachment; filename={timestamp}.csv"},
)

Study this code. Understand what StringIO does, why we call seek(0), and what StreamingResponse provides over a plain return. The Content-Disposition header is what tells the browser to download the response as a file rather than display it.


Testing Your Server

When you have something working, spin up your server and test it on localhost:8000 using Postman (or another tool if you prefer).

Here is some example JSON data you should be able to send to your /upload endpoint with no issues:

{
  "sampling_period": "50",
  "series": [
    [2.6263, 9.3453, -2.2568, -3.0696, 2.2692, -7.8342, 5.8553, 5.1306, 6.0145, 0.2056],
    [-7.5184, 5.7118, 4.4897, 4.2285, 6.8058, 4.0763, 4.2746, 9.6316, -1.4355, 3.9923],
    [7.3124, 6.1849, 4.7516, 9.4689, -7.0708, 8.2936, 6.9755, -5.4065, -1.4347, -0.2195]
  ]
}

(The full 100-element version is available in the test data file.)

Expected response:

{
  "message": "Stored successfully",
  "timestamp": "2026_03_02_12_55_10"
}

List datasets:

Visit http://localhost:8000/datasets:

{
  "datasets": [
    {
      "timestamp": "2026_03_02_12_55_10",
      "sampling_period": 50,
      "samples_per_axis": 100
    }
  ]
}

Download a dataset (use a timestamp from the list response):

Visit http://localhost:8000/download/2026_03_02_12_55_10:

sample_index,time(ms),x,y,z
0,0,2.6263,-7.5184,7.3124
1,50,9.3453,5.7118,6.1849
2,100,-2.2568,4.4897,4.7516
3,150,-3.0696,4.2285,9.4689
...

You can also visit http://localhost:8000/docs in a browser — FastAPI automatically generates interactive API documentation where you can test each endpoint directly.


Hints

  • The timestamp should be generated server-side when data is received, not sent by the client.
  • Remember that SQLite stores your axis data as text. You will need to serialize on the way in and deserialize on the way out.
  • When querying for a specific timestamp in the download endpoint, use parameterized queries (? placeholders) — never build SQL strings with f-strings or concatenation! (Remember Exercise 1.)
  • Your application should be in a single file (e.g., main.py). Keep it simple.
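To make the parameterized-query hint concrete, here is a tiny standalone demonstration (the table contents are made up): with a ? placeholder, a malicious "timestamp" is treated as a literal string that matches nothing, instead of rewriting the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE datasets (timestamp TEXT)")
conn.execute("INSERT INTO datasets VALUES ('2026_03_02_12_55_10')")

# Safe: the driver binds the value; it can never alter the SQL itself
good = conn.execute(
    "SELECT timestamp FROM datasets WHERE timestamp = ?",
    ("2026_03_02_12_55_10",),
).fetchone()

# An injection attempt just becomes a weird timestamp that matches no row
evil = conn.execute(
    "SELECT timestamp FROM datasets WHERE timestamp = ?",
    ("nope' OR '1'='1",),
).fetchone()
```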

Deliverable

When done, upload:

  1. Your FastAPI application (main.py) — the complete server code you wrote.


Once Done

Now you can send large data series up to the server in a mostly reliable fashion. In the next assignment you'll get your ESP32-C3 targeting that endpoint.
