JSON files not being updated on deployments

Website name: hoopers.club (using Astro)

What I’m trying to do:
I have an articles page on my website. It scrapes other platforms to build a list of articles, and each entry redirects users to the article’s original platform so they can read it there.

How I’m doing it:
I have a file, “main.js”, that scrapes the platforms to retrieve information such as each article’s title, URL, photo and author, and writes the data to JSON files:

const axios = require('axios');
const cheerio = require('cheerio');
const fs = require('fs');
// Declared with const so they are not implicit globals.
const url = 'https://www.nbaportugal.com/category/artigos/';
const urlslam = 'https://www.slamonline.com/category/nba/';
const ballurl = 'https://ballislife.com/news/';
const realgmurl = 'https://basketball.realgm.com/nba/news';

const getNbaportugal = () => {
	axios(url)
		.then((response) => {
			const html = response.data;
			const $ = cheerio.load(html);
			const articles = [];
			$('article', html).each(function () {
				const title = $(this).find('a').attr('title');
				const url = $(this).find('a').attr('href');
				const image = $(this).find('a').children().attr('src');
				const author = $(this).find('.byline a').text();
				articles.push({ title, url, image, author });
			});

			let data = JSON.stringify(articles);

			fs.writeFileSync('nba.json', data);
		})
		.catch((err) => console.log(err));
};

const slam = () => {
	axios(urlslam)
		.then((response) => {
			const html = response.data;
			const $ = cheerio.load(html);
			const articles = [];
			$('.blog-post-vert', html).each(function () {
				const title = $(this).find('h3').text();
				const url = $(this).find('a').attr('href');
				const author = $(this)
					.find('.blog-meta')
					.text()
					.match(/(?<=[Bb]y.).+?(?=..\s\s)/)[0];

				const reg = /\(([^)]+)\)/;
				const image = reg.exec(String($(this).find('a').attr('data-bg')))[1];
				articles.push({ url, image, title, author });
			});

			let data = JSON.stringify(articles);

			fs.writeFileSync('slam.json', data);
		})
		.catch((err) => console.log(err));
};

const ballislife = () => {
	axios(ballurl)
		.then((response) => {
			const html = response.data;
			const $ = cheerio.load(html);
			const articles = [];
			$('article', html).each(function () {
				const title = $(this).find('a').attr('title');
				const url = $(this).find('a').attr('href');
				const image = $(this).find('a').children().attr('src');
				const author = $(this).find('.author').text();
				articles.push({ title, url, image, author });
			});

			let data = JSON.stringify(articles);

			fs.writeFileSync('ballislife.json', data);
		})
		.catch((err) => console.log(err));
};

const realgm = () => {
	axios(realgmurl)
		.then((response) => {
			const html = response.data;
			const $ = cheerio.load(html);
			const articles = [];
			$('.article', html).each(function () {
				const title = $(this).find('.article-title').text();
				const url =
					'https://basketball.realgm.com' + $(this).find('a').attr('href');
				const image = $(this).find('img').attr('src')
					? 'https://basketball.realgm.com' + $(this).find('img').attr('src')
					: 'https://basketball.realgm.com/images/basketball/5.0/template/basketball-icon.gif';
				const author = $(this)
					.find('.article-source')
					.text()
					.match(/(?<=\n).+?(?=\n)/g);
				articles.push({ title, url, image, author });
			});

			let data = JSON.stringify(articles);

			fs.writeFileSync('realgm.json', data);
		})
		.catch((err) => console.log(err));
};

getNbaportugal();
slam();
ballislife();
realgm();
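
One thing worth checking in the functions above: a call like .match(...)[0] throws a TypeError when the pattern finds nothing. Because that happens inside the .then callback, the .catch only logs the error and the writeFileSync call for that source never runs, so the file on disk stays stale. A minimal sketch of a null-safe lookup follows; firstMatch is a hypothetical helper, not part of the original script, and the pattern in the usage line is only an illustration:

```javascript
// Hypothetical helper: return the first regex match, or a fallback
// value when the pattern does not match, instead of throwing.
function firstMatch(text, pattern, fallback = '') {
  const match = String(text).match(pattern);
  return match ? match[0].trim() : fallback;
}

// Illustrative usage with a simplified author pattern.
const author = firstMatch('By John Doe', /(?<=[Bb]y\s).+/, 'Unknown');
console.log(author);
```

With a fallback in place, one article with unexpected markup no longer prevents the whole JSON file from being written.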

I then import those JSON files in my articles.astro.

Issue: If I run “npm run build” locally, it builds correctly and updates the JSON files. But when the site runs automatic builds and deploys, or when I make changes and push them to GitHub, which deploys to Netlify, the JSON files on GitHub are not updated.

Not sure what I might be doing wrong.

Interesting behaviour: even though the JSON files on GitHub are not updated, the deployed website shows the correct data. I am not satisfied with the result, though, as I need the JSON files to be updated on GitHub so I can work locally too.

Sorry if I didn’t explain myself correctly. Hope I can get some help to better understand what’s going on.

Deploy Log:

9:07:25 AM: Build ready to start
9:07:27 AM: build-image version: d2c6dbeac570350a387d832f64bc980dc964ad65 (focal)
9:07:27 AM: build-image tag: v4.8.0
9:07:27 AM: buildbot version: e552b142336b2b1222a93a4fd4cbed0019c77b46
9:07:27 AM: Fetching cached dependencies
9:07:27 AM: Starting to download cache of 1.6GB
9:07:44 AM: Finished downloading cache in 17.641163228s
9:07:44 AM: Starting to extract cache
9:08:05 AM: Finished extracting cache in 20.829879179s
9:08:05 AM: Finished fetching cache in 38.63284154s
9:08:05 AM: Starting to prepare the repo for build
9:08:05 AM: Preparing Git Reference refs/heads/main
9:08:07 AM: Parsing package.json dependencies
9:08:08 AM: Starting build script
9:08:08 AM: Installing dependencies
9:08:08 AM: Python version set to 2.7
9:08:09 AM: Started restoring cached node version
9:08:11 AM: Finished restoring cached node version
9:08:11 AM: v16.14.2 is already installed.
9:08:11 AM: Now using node v16.14.2 (npm v8.5.0)
9:08:11 AM: Started restoring cached build plugins
9:08:11 AM: Finished restoring cached build plugins
9:08:12 AM: Attempting ruby version 2.7.2, read from environment
9:08:12 AM: Using ruby version 2.7.2
9:08:13 AM: Using PHP version 8.0
9:08:13 AM: No npm workspaces detected
9:08:13 AM: Started restoring cached node modules
9:08:13 AM: Finished restoring cached node modules
9:08:13 AM: Started restoring cached go cache
9:08:13 AM: Finished restoring cached go cache
9:08:13 AM: go version go1.16.5 linux/amd64
9:08:13 AM: go version go1.16.5 linux/amd64
9:08:13 AM: Installing missing commands
9:08:13 AM: Verify run directory
9:08:15 AM: ​
9:08:15 AM: ────────────────────────────────────────────────────────────────
9:08:15 AM:   Netlify Build                                                 
9:08:15 AM: ────────────────────────────────────────────────────────────────
9:08:15 AM: ​
9:08:15 AM: ❯ Version
9:08:15 AM:   @netlify/build 26.5.2
9:08:15 AM: ​
9:08:15 AM: ❯ Flags
9:08:15 AM:   baseRelDir: true
9:08:15 AM:   buildId: 624bf8bd22a59d7285ce9798
9:08:15 AM:   deployId: 624bf8bd22a59d7285ce979a
9:08:15 AM: ​
9:08:15 AM: ❯ Current directory
9:08:15 AM:   /opt/build/repo
9:08:15 AM: ​
9:08:15 AM: ❯ Config file
9:08:15 AM:   /opt/build/repo/netlify.toml
9:08:15 AM: ​
9:08:15 AM: ❯ Context
9:08:15 AM:   production
9:08:15 AM: ​
9:08:15 AM: ❯ Loading plugins
9:08:15 AM:    - @netlify/plugin-sitemap@0.8.1 from netlify.toml and package.json
9:08:16 AM: ​
9:08:16 AM: ────────────────────────────────────────────────────────────────
9:08:16 AM:   1. Build command from Netlify app                             
9:08:16 AM: ────────────────────────────────────────────────────────────────
9:08:16 AM: ​
9:08:16 AM: $ npm run build
9:08:16 AM: > @example/blog@0.0.1 build
9:08:16 AM: > node public/main.js && astro build
9:08:20 AM: Nba Portugal's data: [{"url":"https://www.slamonline.com/nba/cavs-achieve-first-winning-season-since-2017-2018-campaign/","image":"https://d1l5jyrrh5eluf.cloudfront.net/wp-content/uploads/2022/03/GettyImages-1388281981-scaled.jpg","title":"Cavs Achieve First Winning Season Since 2017-2018 Campaign","author":"Brooks Warr"},{"url":"https://www.slamonline.com/nba/paul-george-upgraded-to-questionable-ahead-of-game-vs-utah-jazz/","image":"https://d1l5jyrrh5eluf.cloudfront.net/wp-content/uploads/2022/03/GettyImages-1239509501-scaled.jpg","title":"Paul George Upgraded to Questionable Ahead of Game Vs. Utah Jazz","author":"Brooks Warr"},{"url":"https://www.slamonline.com/nba/report-myles-turner-out-for-the-rest-of-the-season/","image":"https://d1l5jyrrh5eluf.cloudfront.net/wp-content/uploads/2022/03/GettyImages-1377704021-scaled.jpg","title":"REPORT: Myles Turner Out For the Rest of the Season","author":"Brooks Warr"},{"url":"https://www.slamonline.com/nba/report-robert-williams-suffers-meniscus-tear-out-indefinitely/","image":"https://d1l5jyrrh5eluf.cloudfront.net/wp-content/uploads/2022/03/GettyImages-1364700778-scaled.jpg","title":"REPORT: Robert Williams Suffers Meniscus Tear, Out Indefinitely","author":"Jerry Humphrey I"},{"url":"https://www.slamonline.com/nba/report-juwan-morgan-gets-call-up-to-celtics/","image":"https://d1l5jyrrh5eluf.cloudfront.net/wp-content/uploads/2022/03/GettyImages-1239375692-scaled.jpg","title":"REPORT: Juwan Morgan Gets Call-Up to Celtics","author":"Brooks Warr"},{"url":"https://www.slamonline.com/nba/thunder-rookie-josh-giddey-out-for-season/","image":"https://d1l5jyrrh5eluf.cloudfront.net/wp-content/uploads/2022/03/GettyImages-1238739864-scaled.jpg","title":"Thunder Rookie Josh Giddey Out For Season","author":"Nick Cra"}]
08:08:20 AM [config] Set "buildOptions.site" to generate correct canonical URLs and sitemap
9:08:48 AM: dasdasd[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
9:08:48 AM: 08:08:48 AM [build] 179 pages built in 28.45s (159ms/page)
9:08:48 AM: 08:08:48 AM [build] 🚀 Done
9:08:48 AM: ​
9:08:48 AM: (build.command completed in 32.6s)
9:08:48 AM: ​
9:08:48 AM: ────────────────────────────────────────────────────────────────
9:08:48 AM:   2. @netlify/plugin-sitemap (onPostBuild event)                
9:08:48 AM: ────────────────────────────────────────────────────────────────
9:08:48 AM: ​
9:08:48 AM: Creating sitemap from files...
9:08:49 AM: Sitemap Built! sitemap.xml
9:08:49 AM: ​
9:08:49 AM: (@netlify/plugin-sitemap onPostBuild completed in 141ms)
9:08:49 AM: ​
9:08:49 AM: ────────────────────────────────────────────────────────────────
9:08:49 AM:   3. Deploy site                                                
9:08:49 AM: ────────────────────────────────────────────────────────────────
9:08:49 AM: ​
9:08:49 AM: Starting to deploy site from 'dist'
9:08:50 AM: Creating deploy tree asynchronously
9:08:50 AM: Creating deploy upload records
9:08:52 AM: 4 new files to upload
9:08:52 AM: 0 new functions to upload
9:08:52 AM: Site deploy was successfully initiated
9:08:52 AM: ​
9:08:52 AM: (Deploy site completed in 3.8s)
9:08:53 AM: ​
9:08:53 AM: ────────────────────────────────────────────────────────────────
9:08:53 AM:   Netlify Build Complete                                        
9:08:53 AM: ────────────────────────────────────────────────────────────────
9:08:53 AM: ​
9:08:53 AM: (Netlify Build completed in 37.9s)
9:08:53 AM: Caching artifacts
9:08:53 AM: Started saving node modules
9:08:53 AM: Finished saving node modules
9:08:53 AM: Started saving build plugins
9:08:53 AM: Finished saving build plugins
9:08:53 AM: Started saving pip cache
9:08:53 AM: Finished saving pip cache
9:08:53 AM: Started saving emacs cask dependencies
9:08:53 AM: Finished saving emacs cask dependencies
9:08:53 AM: Started saving maven dependencies
9:08:53 AM: Finished saving maven dependencies
9:08:53 AM: Started saving boot dependencies
9:08:53 AM: Finished saving boot dependencies
9:08:53 AM: Started saving rust rustup cache
9:08:53 AM: Starting post processing
9:08:53 AM: Finished saving rust rustup cache
9:08:53 AM: Started saving go dependencies
9:08:53 AM: Finished saving go dependencies
9:08:53 AM: Build script success
9:08:54 AM: Post processing - HTML
9:08:56 AM: Post processing - header rules
9:08:57 AM: Post processing - redirect rules
9:08:57 AM: Post processing done
9:09:00 AM: Site is live ✨
9:11:02 AM: Finished processing build request in 3m35.067058039s

The data displayed by the console.log at “9:08:20 AM:” is the updated data that is not being pushed to the JSON file on GitHub.

Hey @hoopersclub,

Are you doing something to push the data to GitHub? Netlify is not going to update your repo, so you might have to do a git push yourself. However, if you’re building from the same repo you push to, make sure you don’t end up in a circular build-deploy loop.
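
One way to lower the risk of such a loop, sketched under assumptions: only rewrite a JSON file when the scraped content actually differs, so that an unchanged scrape produces nothing new to commit or push. writeIfChanged is a hypothetical helper name, not something from the original script:

```javascript
const fs = require('fs');

// Hypothetical helper: write the serialized articles only when they
// differ from what is already on disk. Returns true if a write happened.
function writeIfChanged(filePath, articles) {
  const next = JSON.stringify(articles);
  const previous = fs.existsSync(filePath)
    ? fs.readFileSync(filePath, 'utf8')
    : null;
  if (previous === next) {
    return false; // nothing changed, so there is nothing to commit
  }
  fs.writeFileSync(filePath, next);
  return true;
}
```

A push step could then be skipped whenever every call returns false.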

I know now that to do so I would have to use axios or something similar to update the JSON file on GitHub. But it’s working just fine without that, because the JSON file is being updated locally on the Netlify server, so no problem.

I guess if I need to debug anything related to it locally, I can just run node "filename.js" and the JSON files will be updated locally.

Hello

Looking at what you are trying to achieve it looks like you are relying on our build system to run your script first and then use that updated data to make a build.

Can you walk me through the strategy here? When do you actually need to use Netlify?

  • When you want to commit new changes to your website because you changed something?
  • or each day to “update” the articles list?

If you want periodic updates, may I recommend running a script locally that generates new versions of the JSON files and commits them to your repo? That way Netlify will pick up the commit and build a new version of your website.

Rather than relying on netlify to update the JSONs, you provide the updated JSONs to Netlify through a standard commit.
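
A minimal sketch of such a local script, under a few assumptions: it is run from the repo root, the scraper lives at public/main.js (as in the build command shown in the deploy log), and the four JSON files are written to the repo root. The file list, the commit message and the function names are illustrative:

```javascript
const { execFileSync } = require('child_process');

// Assumption: these are the files main.js writes, relative to the repo root.
const JSON_FILES = ['nba.json', 'slam.json', 'ballislife.json', 'realgm.json'];

// Build the git invocations as argument arrays so nothing goes through a shell.
function gitSteps(files, message) {
  return [
    ['add', ...files],
    ['commit', '-m', message],
    ['push'],
  ];
}

// Run the scraper, then commit and push the regenerated files.
// Not called automatically here; invoke refreshAndPush() when ready.
function refreshAndPush() {
  execFileSync('node', ['public/main.js'], { stdio: 'inherit' });
  for (const args of gitSteps(JSON_FILES, 'Update scraped article data')) {
    execFileSync('git', args, { stdio: 'inherit' });
  }
}
```

Note that git commit exits with a non-zero status when nothing changed, in which case execFileSync throws and the push is skipped.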

Well, ideally both. I want Netlify to build every time I change something on the website via GitHub AND to rebuild the website every 8 hours, so that it runs main.js, where I have a bunch of functions that should run every 8 hours.
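
For the every-8-hours part, one possible approach is a Netlify build hook: a URL that starts a new build when it receives a POST request. A minimal sketch, assuming a hook has been created in the site settings and that some external scheduler (a cron job, for example) runs the script; YOUR_HOOK_ID and the BUILD_HOOK_URL environment variable are placeholders, not real values:

```javascript
const https = require('https');

// Placeholder: replace with the build hook URL created for the site.
const BUILD_HOOK_URL =
  process.env.BUILD_HOOK_URL ||
  'https://api.netlify.com/build_hooks/YOUR_HOOK_ID';

// Turn the hook URL into request options for an empty POST.
function buildHookRequest(hookUrl) {
  const { hostname, pathname } = new URL(hookUrl);
  return {
    hostname,
    path: pathname,
    method: 'POST',
    headers: { 'Content-Length': 0 },
  };
}

// Send the POST and resolve with the HTTP status code.
function triggerBuild(hookUrl) {
  return new Promise((resolve, reject) => {
    const req = https.request(buildHookRequest(hookUrl), (res) => {
      res.resume(); // discard the response body
      res.on('end', () => resolve(res.statusCode));
    });
    req.on('error', reject);
    req.end();
  });
}

// Example (not run automatically):
// triggerBuild(BUILD_HOOK_URL).then((status) => console.log(status));
// A cron schedule of "0 */8 * * *" would run it every 8 hours.
```

Since the build command already runs main.js before astro build, each triggered build would scrape fresh data without any commit being involved.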

Not sure how to set up such a script locally :grimacing:

Hey @hoopersclub,

I think we need to take a step back and try to analyse this thread piece by piece. Here’s what we’ve understood so far:

  1. You are running a script to update a JSON file.
  2. You want to update that file in your Git repo, but we don’t know why. Maybe you can elaborate on the specific reason you wish to do so, but if it’s what we assume it is (point 3), you can read on.
  3. You’re simply trying to fetch the “updated” file from GitHub to serve in your site.

If these assumptions are true, this is what you can do:

Instead of:

fs.writeFileSync('realgm.json', data);

if you change the line to:

fs.writeFileSync('./dist/realgm.json', data)

That file will get published on your website as https://example.netlify.app/realgm.json, which you can access directly, removing GitHub from the equation entirely.
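
A small sketch of that change, with one caveat: writeFileSync fails if the target directory does not exist yet, which can be the case when the script runs before the build. writeJson is a hypothetical helper, and whether the build step keeps files already present in the output directory is an assumption to verify; if it does not, writing into public/ (which Astro copies into the build output) is an alternative.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: make sure the output directory exists,
// then write the articles there as JSON. Returns the file path.
function writeJson(outDir, fileName, articles) {
  fs.mkdirSync(outDir, { recursive: true });
  const target = path.join(outDir, fileName);
  fs.writeFileSync(target, JSON.stringify(articles));
  return target;
}

// Example (not run automatically):
// writeJson('./dist', 'realgm.json', articles);
```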

The benefits of doing this? Sure, here they are:

  1. You can easily manage updating the file on the website no matter how you deploy - by pushing to GitHub, or by automatically triggering a build after a specific time.
  2. You can make your repo private if needed, as you won’t have to rely on GitHub to serve the file.
  3. You can avoid more complicated situations by not having to prevent a circular loop when pushing back to GitHub.

Let us know if this helps or please correct us if we’ve made an incorrect assumption on what you’re trying to do.

Note that, if you still wish to push the file to GitHub, there’s a way to do it, but it’s going to be more complicated than needed. So, we would most likely suggest alternative ways in that scenario.

The only reason I wanted to update the JSON file is that the website is already in production, but we are still developing new features and improving existing ones. When I’m working locally, it displays the data from the local JSON (the one on GitHub), which shows different data if it hasn’t been updated.

Imagine the articles page. My website displays x articles, but locally I see x - y articles, where y is the number of articles added since the last update of the JSON file on GitHub.

So, it’s just for debugging and testing new stuff that I would need that json file update. I can instead run node file-that-updates-json-locally.js and it works.

hoopersclub,

is there anything preventing you from running a push after you edit the json file?

here you say “it’s just for debugging and testing new stuff that I would need that json file update.” which seems like a pretty occasional and also clearly timebound thing.

i worry a bit that you are trying to overengineer a solution here, and that you might be better off with sticking with something simpler until your use case evolves.

we’re going to duck out of this conversation now, but do let us know if you have any other questions in the future!

I’m sorry @perry, I think I didn’t express myself correctly, perhaps because I’m not writing in my mother tongue :sweat_smile: :grimacing:

At first I didn’t understand why I was getting the right data in production when I wasn’t locally. Then, I think on the Astro Discord, someone explained it to me along the lines of “because the json file is being updated locally on netlify server”, which I wrote some posts ago, and I learnt that I could work around it with a simple node filename.js.

From that point on I was answering your questions so I could also understand better how it works. Thanks to all.
