Trouble with handling files in Netlify Function

Hi! I’m working on a Netlify Function where we take form data for a job application (including a file upload) and pass the data on to a third-party API for use in their system. I was following along with this handy post (thanks!) but seem to have run into a situation where the data in the file is not handled properly. For example, PDFs turn up with blank content, though ASCII metadata appears to be at least partly intact. This happens at least when using the Netlify CLI; I have yet to try on a deploy preview. Writing to a local directory confirms that the issue isn’t with the third-party API. Is there something I’m missing when working with these files? Example code below (note that I’ve also attempted to work with the Buffer data, with identical results).

Netlify site name, if you need it: transformdataio

Fetch function to call the Netlify Function:

const data = new FormData(form);

fetch('/.netlify/functions/apply', {
  method: 'POST',
  body: data, 
}).then(res => {
  if (!res.ok && res.status !== 406) {
    throw new Error('oh no');
  }

  return res.json();
}).then(data => {
  if (Array.isArray(data.missingRequiredFields) && data.missingRequiredFields.length > 0) {
    console.log(data);
    showMissingFields(data.missingRequiredFields);
  } else {
    showConfirmationMessage(data.message);
  }
}).catch(err => {
  showWarningMessage('Something went wrong; please try again.');
}).finally(() => {
  submitButton.removeAttribute('disabled');
});

And here’s our Netlify Function:

const path = require('path');
const os = require('os');
const fs = require('fs');
const request = require('request-promise-native');
const Busboy = require('busboy');

const apiKey = Buffer.from(`${process.env.ASHBY_API_KEY}:`, 'ascii').toString('base64');
const apiUrl = 'https://api.ashbyhq.com/applicationForm.submit';

const REQUIRED_FIELD_NAMES = [
  '_systemfield_email',
  '_systemfield_name',
  '_systemfield_resume'
];

const parseMultipartForm = (event) => {
  return new Promise((resolve) => {
    const fields = {};
    
    const busboy = new Busboy({
      headers: event.headers
    });
    
    busboy.on(
      'file',
      (fieldname, filestream, filename, transferEncoding, mimeType) => {
        const saveTo = path.join(os.tmpdir(), path.basename(fieldname));
        filestream.pipe(fs.createWriteStream(saveTo));
        fields[fieldname] = {
          filename,
          filepath: saveTo
        };
      }
    );
    
    busboy.on('field', (fieldname, value) => {
      fields[fieldname] = value;
    });
    
    busboy.on('finish', () => {
      resolve(fields);
    });
    
    busboy.write(event.body);
  });
};

exports.handler = async (event, context) => {
  const fields = await parseMultipartForm(event);
  const fieldSubmissions = [];
  const missingRequiredFields = [];
    
  // @see https://developers.ashbyhq.com/reference#applicationformsubmit for construction details
  for (const [path, value] of Object.entries(fields)) {
    if (path === 'jobPostingId' || path === 'g-recaptcha-response') {
      continue;
    }
    
    if (path === '_systemfield_resume') {
      if (value.filename === '') {
        missingRequiredFields.push(path);
      }
      
      fieldSubmissions.push({
        path,
        value: 'resume_1'
      });
    } else {
      if (REQUIRED_FIELD_NAMES.includes(path) && value === '') {
        missingRequiredFields.push(path);
      }
      fieldSubmissions.push({ path, value });
    }
  }
  
  if (missingRequiredFields.length > 0) {
    return {
      statusCode: 406, // "Not Acceptable"
      body: JSON.stringify({
        message: 'missing one or more required fields',
        missingRequiredFields
      })
    }
  }
  
  const { filename, filepath } = fields._systemfield_resume;
  
  const options = {
    'method': 'POST',
    'url': apiUrl,
    'headers': {
      'Content-Type': 'multipart/form-data',
      'Authorization': `Basic ${apiKey}`
    },
    formData: {
      'applicationForm': JSON.stringify({ fieldSubmissions }),
      'resume_1': {
        'value': fs.createReadStream(filepath),
        'options': {
          'filename': filename,
          'contentType': null
        }
      },
      'jobPostingId': fields.jobPostingId
    }
  };
  
  return await request(options)
    .then((res) => {
      console.log(res);
      return {
        statusCode: 200,
        body: JSON.stringify({ message: `Thanks, ${fields._systemfield_name}!` })
      };
    })
    .catch((err) => {
      console.error('ASHBY_ERROR', err);
      return {
        statusCode: 500,
        body: JSON.stringify({ message: 'Something went wrong; please try again.' })
      };
    });
};

Any guidance you might have would be greatly appreciated! (Worst case I’m willing to try to convert this to use Netlify Forms & a submission-created event to forward everything, but we don’t need any of this data stored in Netlify’s systems.)

Unlike the example in the post you linked to, you have omitted the mimeType from the code snippet above.
Does adding this in change the outcome?

Hi, have you solved this issue? I’m seeing the same behaviour in my function.

Hi @oliviernt

Could you try using lambda-multipart-parser to see if it makes a difference? It depends on busboy but I personally find it easier to use.

It can be used like:

const MultipartParser = require('lambda-multipart-parser')

exports.handler = async event => {
  return MultipartParser.parse(event).then(formData => {
    console.log(formData)
  })
}

Hi @hrishikesh

thank you for your help. I’ve tried it out but the result is the same. Somehow the image stream seems to be corrupt or somehow missing information on arrival…

In that case, I think the best bet is to use an external storage solution like S3 and connect to it using Lambda.

That’s actually what I tried to achieve with Netlify. But I have now changed the logic to only use the Netlify function to get the signed URL from AWS S3 and then upload the image directly to S3 from the client…

No, it didn’t solve the issue.

I am also facing the exact same issue described in the question. I followed this → How to Parse Multipart Form Data with Netlify Functions. PDF files come through empty when using a Netlify Function, while txt files work fine. There is a problem with the Netlify Function: the file data gets corrupted automatically. Can you check and provide a fix?

Hi @Coderr,

I am not able to reproduce this issue. I had not tested it myself until now, but I just did. I uploaded a PDF file via Postman to a serverless function I created, stored the file in Firebase, and it worked fine.

As you can see, the size is exactly the same.

I even got the same MD5 hash for both the files (the original and the one I downloaded from Firebase). Here’s my code:

import { initializeApp } from 'firebase/app'
import { getStorage, ref, uploadString } from 'firebase/storage'

exports.handler = async event => {
  if (event.isBase64Encoded) {
    const storage = getStorage(initializeApp({
      /* firebase stuff */
    }))
    const somePDF = ref(storage, 'file.pdf')
    return uploadString(somePDF, event.body, 'base64').then(() => {
      return {
        body: JSON.stringify('Uploaded successfully'),
        statusCode: 200
      }
    })
  }
}

I simply sent a POST request to this endpoint with Postman. If you still think it’s not working for you, could you help us with an example PDF that’s not working or a reproduction case to test?

Yes, it works fine when sending from Postman, but it does not work when sending the data from code. You can reproduce it by following the Multipart Form Data with Netlify Functions post linked above.

I think in that case the problem would be that since Forms are encoded with Multipart content header, they’re excluded from the base64 encoding of the data.

Could you try to perform base64 encoding on the client-side and send that along with the form data and see how it goes?
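
A minimal sketch of what client-side base64 encoding could look like, assuming the file's bytes have already been read (in a browser, for example, via `await file.arrayBuffer()`). The helper name `bytesToBase64` and the field name in the note below are hypothetical, not something from this thread.

```javascript
// Hypothetical helper: converts raw bytes into a base64 string using btoa.
function bytesToBase64(bytes) {
  let binary = '';
  for (const byte of bytes) {
    binary += String.fromCharCode(byte); // one character per byte (0-255)
  }
  return btoa(binary);
}

// The first four bytes of a PDF file: "%PDF"
const sample = new Uint8Array([0x25, 0x50, 0x44, 0x46]);
const encoded = bytesToBase64(sample);

console.log(encoded); // JVBERg==
```

The result could then be appended to the FormData as an ordinary text field (for instance `data.append('resume_base64', encoded)`, a made-up field name) and decoded in the function with `Buffer.from(value, 'base64')`.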

It is not working. Have you tried on your end?

Could you post your code example that’s not working?

Here is the netlify function code.

const Busboy = require("busboy");
const FormData = require("form-data");
const fetch = require("node-fetch");

function parseMultipartForm(event) {
  // source: https://www.netlify.com/blog/2021/07/29/how-to-process-multipart-form-data-with-a-netlify-function/
  return new Promise(resolve => {
    const fields = {};

    const busboy = new Busboy({
      // uses request headers to extract the form boundary value
      headers: event.headers,
    });

    busboy.on("file", (fieldname, filestream, filename, transferEncoding, mimeType) => {
      // ... we take a look at the file's data ...
      filestream.on("data", data => {
        fields[fieldname] = {
          filename,
          type: mimeType,
          content: data,
          transferEncoding,
        };
      });
    });

    // whenever busboy comes across a normal field ...
    busboy.on("field", (fieldName, value) => {
      // ... we write its value into `fields`.
      fields[fieldName] = value;
    });

    // once busboy is finished, we resolve the promise with the resulting fields.
    busboy.on("finish", () => {
      resolve(fields);
    });

    // now that all handlers are set up, we can finally start processing our request!
    busboy.write(event.body);
  });
}

exports.handler = async function(event, context) {
  // parse the incoming multipart/form-data data into fields object
  const fields = await parseMultipartForm(event);

  // create a new FormData object to be sent to Lever
  const form = new FormData();

  for (const [key, value] of Object.entries(fields)) {
    if (key === "resume") {
      // append "resume" with the file buffer and add the file name
      form.append("resume", value.content, { filename: value.filename });
    } else {
      form.append(key, value);
    }
  }
};

When I console.log the event.body, it is larger for the PDF file than the original uploaded file.

Found one thread regarding multipart/form data → Netlify server functions unable to handle multipart/form-data
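
One detail in the function above may be worth checking: a stream's 'data' event can fire several times for a single file, and the handler assigns `content: data` on each event, so only the last chunk would be kept. Below is a minimal sketch of collecting every chunk instead. It uses a plain EventEmitter as a stand-in for busboy's file stream, and `collectChunks` is a hypothetical helper name.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: gathers every chunk of a file stream into one Buffer
// and hands the result to onDone once the stream ends.
function collectChunks(stream, onDone) {
  const chunks = [];
  stream.on('data', (chunk) => chunks.push(chunk));
  stream.on('end', () => onDone(Buffer.concat(chunks)));
}

// Stand-in for busboy's file stream, delivering the file in two chunks.
const fakeStream = new EventEmitter();
let collected = null;
collectChunks(fakeStream, (buffer) => {
  collected = buffer;
});

fakeStream.emit('data', Buffer.from('%PDF-'));
fakeStream.emit('data', Buffer.from('1.4'));
fakeStream.emit('end');

console.log(collected.toString('latin1')); // %PDF-1.4
```

Inside the busboy 'file' handler, the same idea would mean pushing each chunk into an array and building `content` with `Buffer.concat` once the file stream emits 'end'.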

Hi @Coderr,

It’s because the event.body is base64 encoded.

You need to use Buffer.from(event.body, 'base64').toString('utf8').

Full code:

import { initializeApp } from 'firebase/app'
import { getStorage, ref, uploadString } from 'firebase/storage'

const Busboy = require('busboy')

exports.handler = async event => {

  return new Promise(resolve => {

    const fields = {}

    const busboy = new Busboy({
      headers: event.headers
    })

    busboy.on('file', (fieldname, filestream, filename, _, mimeType) => {
      filestream.on('data', data => {
        fields[fieldname] = {
          content: data,
          filename,
          type: mimeType
        }
      })
    })

    busboy.on('field', (fieldName, value) => {
      fields[fieldName] = value
    })

    busboy.on('finish', () => {
      resolve(fields)
    })

    busboy.write(Buffer.from(event.body, 'base64').toString('utf8'))

  }).then(formData => {

    return uploadString(ref(getStorage(initializeApp({
      /* firebase stuff */
    })), formData.name), formData.file.content.toString('base64'), 'base64').then(() => {

      return {
        body: JSON.stringify(true),
        statusCode: 200
      }

    })

  })

}

Note: This is my form:

<form enctype = "multipart/form-data" method = "POST" name = "fileForm">
  <p>
    <label>
      File name:
      <input type = "text" name = "name"/>
    </label>
  </p>
  <p>
    <label>
      File:
      <input type = "file" name = "file"/>
    </label>
  </p>
  <p>
    <button>
      Send
    </button>
  </p>
</form>

So if your form has different fields and names (it most likely will), you’d have to make adjustments in your serverless code.
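
A possible explanation for PDFs arriving damaged while text files survive: decoding the body to a UTF-8 string is lossy for bytes that are not valid UTF-8, whereas a Buffer keeps them intact. The sketch below illustrates this with a few made-up bytes resembling the start of a PDF. It is an illustration only, not a confirmed diagnosis of the issue in this thread.

```javascript
// Bytes resembling the start of a PDF: an ASCII header, then a binary comment line.
const original = Buffer.from([
  0x25, 0x50, 0x44, 0x46, 0x2d, 0x31, 0x2e, 0x34, 0x0a, // "%PDF-1.4\n"
  0x25, 0xe2, 0xe3, 0xcf, 0xd3, 0x0a,                   // "%" + four non-ASCII bytes
]);

// Stand-in for a base64-encoded event.body.
const body = original.toString('base64');

// Decoding straight to a Buffer preserves every byte.
const asBuffer = Buffer.from(body, 'base64');

// Round-tripping through a UTF-8 string replaces invalid sequences with U+FFFD.
const viaUtf8 = Buffer.from(Buffer.from(body, 'base64').toString('utf8'), 'utf8');

console.log(asBuffer.equals(original)); // true
console.log(viaUtf8.equals(original)); // false
console.log(viaUtf8.subarray(0, 8).toString('ascii')); // %PDF-1.4
```

The ASCII part survives the string round trip while the binary part does not. Since busboy is a writable stream, `busboy.write()` also accepts a Buffer, so the `.toString('utf8')` step could be dropped.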

After adding Buffer.from(event.body, 'base64').toString('utf8') in busboy.write, I am getting the timed out error below. Does PDF content work with your code above?

{"errorMessage":"Task timed out after 10.00 seconds","errorType":"TimeoutError","stackTrace":["new TimeoutError (lib/node_modules/netlify-cli/node_modules/lambda-local/build/lib/utils.js:112:28)

This time I tried a simple 20 KB YML file as that was readily available on my Desktop. I was able to get the file in my Firebase Storage and it was MD5 perfect. So I assumed it would work for other files too.

How big is the PDF that you’re trying to use?

Also, I think the error you’ve got comes from Netlify CLI. Netlify CLI doesn’t encode the event.body as base64 (yet; I’ll be filing an issue for that later today). So in Netlify CLI you have to use it like before; the above code would only work in production.

If you wish to support both the environments, you would have to do something like:

if (event.isBase64Encoded) {
  busboy.write(Buffer.from(event.body, 'base64').toString('utf8'))
} else {
  busboy.write(event.body)
}

OR one liner (might not work):

busboy.write(event.isBase64Encoded ? Buffer.from(event.body, 'base64').toString('utf8') : event.body)
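
A small sketch of the same branching as a reusable function, under two assumptions: the flag is spelled `isBase64Encoded` (capital B, as in the Firebase example earlier in this thread), and the decoded body is kept as a Buffer rather than converted to a string. `bodyForParser` is a hypothetical helper name.

```javascript
// Hypothetical helper: picks what to hand to busboy.write() in each environment.
function bodyForParser(event) {
  return event.isBase64Encoded
    ? Buffer.from(event.body, 'base64') // production: decode, keep the raw bytes
    : event.body; // local CLI (per the note above): pass the body through as-is
}

// Simulated events for both environments.
const prod = bodyForParser({
  isBase64Encoded: true,
  body: Buffer.from('hello').toString('base64'),
});
const local = bodyForParser({ isBase64Encoded: false, body: 'hello' });

console.log(Buffer.isBuffer(prod), prod.toString()); // true hello
console.log(typeof local); // string
```

It would be used as `busboy.write(bodyForParser(event))`.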

I will try the above. The PDF I used was also small, around 50 KB.