🚀 Getting Onto the Same Page

Recently, I had to set up an FTP server for the first time and integrate it with an existing Node.js backend service. I'm writing this case study not only to document the process clearly for myself but also to gather feedback from experienced engineers regarding potential drawbacks and areas for improvement.

But first, a bit of wisdom from Wikipedia (I certainly had to read up a little there 😅):

The File Transfer Protocol (FTP) is traditionally used to transfer files between a client and a server over a network. Similar to HTTP, plain FTP is unencrypted. Therefore, secure variants like FTPS (FTP over SSL/TLS) or SFTP (SSH File Transfer Protocol) are commonly preferred.

In this case study, we'll walk through integrating a secure FTP workflow, specifically SFTP, directly into an existing Node.js HTTP service.


📌 How Did We Get Here?

We needed to allow an external third-party service to regularly upload XML content into our system. Although a standard HTTP POST endpoint was initially considered, the vendor specifically required FTP as their transfer method.

Therefore, our solution involved:

  • Setting up an SFTP server securely on our existing Ubuntu (AWS EC2) instance to handle XML uploads.
  • Integrating a file-watching mechanism (Chokidar) directly into our existing Node.js application (managed with PM2), avoiding extra complexity and centralizing schemas and logic.
  • Automatically detecting uploaded files, parsing their XML contents, inserting the parsed data into our database, and cleaning up uploaded files to maintain optimal storage usage.

🛠️ The Arsenal Room

We decided to use the following tools to develop the pipeline:

  • 🔒 SFTP via OpenSSH: Secure, encrypted file uploads
  • 📂 Linux ACL (Access Control Lists): Precise file-permission control
  • 🌐 Node.js & PM2: Centralized app server and process management
  • 👀 Chokidar: Integrated file watcher within our Node.js app
  • 📑 xml2js: XML-to-JSON parsing utility
  • 💾 MongoDB & Prisma: Centralized database and schema management

🎯 The Mission Statement (High-Level Overview)

With the tooling settled, here was our high-level plan:

  1. Install Dependencies

    • OpenSSH and ACL utilities on Ubuntu
    • Node.js packages (chokidar, xml2js, Prisma dependencies)
  2. Create Secure, SFTP-Only User

    • Dedicated SFTP user restricted to a chroot directory
  3. Configure Directory Permissions

    • Allow uploads by the SFTP user
    • Allow Node.js (Ubuntu user) to delete processed files via ACL
  4. Integrate Watcher Logic within Existing Node.js Application

    • No separate PM2 instance needed; direct integration
    • Centralized access to Prisma/MongoDB schema

💻 Here's How We Did It

Let's get our hands dirty with some code and commands.

Step 1: 🔧 Installing System Dependencies

First, we had to install the necessary Linux tools on our server. Since we were running Ubuntu in our specific case, both were available via apt.

sudo apt-get update
sudo apt-get install openssh-server acl

Step 2: 🔐 The Locks and Keys

Then we had to create a dedicated SFTP user who can only access the upload directory. We also had to allow the default user (ubuntu in our case), which runs the Node.js server via PM2, to delete files in that directory.

Create SFTP group and user:

sudo groupadd sftpgroup
sudo useradd -g sftpgroup -s /usr/sbin/nologin sftpuser
sudo passwd sftpuser

Setup Directory Structure:

sudo mkdir -p /srv/sftp/sftpuser/uploads

Set Essential Permissions:

  • Root-owned chroot directory:
sudo chown root:root /srv/sftp/sftpuser
sudo chmod 755 /srv/sftp/sftpuser
  • Uploads folder owned by sftpuser:
sudo chown sftpuser:sftpgroup /srv/sftp/sftpuser/uploads
sudo chmod 755 /srv/sftp/sftpuser/uploads
  • Grant deletion permissions to Node.js process (ubuntu) via ACL:
sudo setfacl -m u:ubuntu:rwx /srv/sftp/sftpuser/uploads
sudo setfacl -d -m u:ubuntu:rwx /srv/sftp/sftpuser/uploads

Verify Permissions:

getfacl /srv/sftp/sftpuser/uploads

This setup ensures:

  • Secure uploads by the SFTP user
  • File deletion capability by Node.js (ubuntu) via ACL
  • No conflicts with SSH login permissions for the ubuntu user

Step 3: 🔒 Configure OpenSSH for SFTP-Only Access

Then, to make sure the new user cannot reach the rest of the machine, we have to configure OpenSSH so that it can only connect via SFTP, jailed into its chroot directory. Open /etc/ssh/sshd_config in a text editor, change the existing Subsystem line to use internal-sftp (sshd allows only one Subsystem entry), and append the Match block:

Subsystem sftp internal-sftp

Match User sftpuser
  ChrootDirectory /srv/sftp/%u
  ForceCommand internal-sftp
  X11Forwarding no
  AllowTcpForwarding no
  PasswordAuthentication yes

Then restart the SSH service for the changes to take effect:

sudo systemctl restart ssh

Step 4: 👀 The Silent Watcher

Finally, we had to integrate the watcher logic into our existing Node.js server. For that, we decided to use Chokidar, which constantly monitors the upload directory. As soon as a file appears there, we parse its XML content to JSON, insert the result into the database, and then delete the file, as it is no longer needed.

Install Node.js packages:

npm install chokidar xml2js

Add the watcher logic directly to your main Node.js entry point (e.g., index.js). Example integration:

import chokidar from "chokidar";
import fs from "fs";
import { parseStringPromise } from "xml2js";
import prisma from "../config/prisma.js";

// Directory to watch:
const WATCH_DIR = "/srv/sftp/sftpuser/uploads"; // or wherever your upload dir is

function initFileWatcher() {
  const watcher = chokidar.watch(WATCH_DIR, {
    persistent: true,
    ignoreInitial: true, // ignore existing files at startup
    awaitWriteFinish: {
      stabilityThreshold: 2000, // amount of time (in ms) to confirm no more writes
      pollInterval: 100, // how often to poll file size changes
    },
  });

  watcher.on("add", async (filePath) => {
    // eslint-disable-next-line no-console
    console.log("New file detected:", filePath);
    try {
      // 1. Read file from disk
      const xmlContent = fs.readFileSync(filePath, "utf8");

      // 2. Parse XML
      const jsonData = await parseStringPromise(xmlContent);

      // 3. Insert into DB
      await prisma.biuContent.create({
        data: { body: JSON.stringify(jsonData) },
      });

      // eslint-disable-next-line no-console
      console.log("Processed file:", filePath);
    } catch (err) {
      // eslint-disable-next-line no-console
      console.error("Error processing file:", filePath, err);
    } finally {
      // 4. Delete file from disk (if it still exists)
      if (fs.existsSync(filePath)) {
        fs.unlinkSync(filePath);
        // eslint-disable-next-line no-console
        console.log("Deleted file:", filePath);
      }
    }
  });

  watcher.on("error", (error) => {
    // eslint-disable-next-line no-console
    console.error("Watcher error:", error);
  });

  // eslint-disable-next-line no-console
  console.log(`Watcher initialized. Now monitoring: ${WATCH_DIR}`);
}

export default initFileWatcher;

This approach:

  • Keeps the watcher directly integrated within the existing Node.js app
  • Maintains consistency and centralizes schema logic

⚠️ The Devil is in the Details

Some gotchas to watch out for.

  • Important: do NOT add the Node.js user (ubuntu) to the SFTP chroot group (sftpgroup); this breaks SSH login (been there, done that 😅)!
  • Always explicitly set ACL permissions to avoid permission conflicts.
  • Regularly verify ACL settings with getfacl.
  • Use ssh -vvv username@host and sftp -v username@host (verbose flags) to debug connection and permission issues.


🎉 Conclusion

By combining these basic tools, we managed to build an MVP solution within a couple of hours. We also made the system as robust and secure as we could by carefully restricting the FTP user's access. I am open to hearing about any flaws or issues with the system, so any advice is much appreciated!