+2 votes

I'd like to be able to read in a GT data file directly within R, saving me the hassle of downloading a CSV locally and putting it in the right folder.

Is there any way to get a direct link to the GT CSV, or to automate the process of reading in data via R?

I've tried something like:

download.file(url = "https://www.guidedtrack.com/programs/17783/csv", destfile = "test.csv")

But the resulting file indicates I need authentication to do so. Is there an easier way of doing this?

by (480 points)

1 Answer

+3 votes
Best answer

Downloading the CSV necessarily means authentication, because GuidedTrack has to ensure that it's sending the data to someone who is authorized to read it. Not requiring authentication would effectively make all of your CSV data public, which wouldn't be responsible.

That said, I think it's possible to do what you want. There are probably many ways, but the one I found was to use httr, which is a wrapper for curl and has options for authentication.

The following ought to work to get your data into R; just replace GT_ACCOUNT_EMAIL and GT_ACCOUNT_PASSWORD with your account details:

library(httr)

GET("https://www.guidedtrack.com/programs/17783/csv", authenticate("GT_ACCOUNT_EMAIL", "GT_ACCOUNT_PASSWORD"))

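The above returns an httr response object rather than the data itself, so it probably needs more massaging (for example with httr's content() function) to get a data frame or something else useful out of it, but it should solve the immediate problem you're facing.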

Another tricky bit is that the CSV is streamed to the client, since it might be a huge file. I hope httr handles that correctly, but if it doesn't, you might see your data truncated after 500 or so rows.

Edit: For people looking for info on how to write scripts that download the data for a GuidedTrack program, the solution is pretty much the same as above. Of course, you wouldn't be using R to write your scripts, but any language would have equivalent facilities.

Here's an example of using curl in a bash script to achieve the same result:

curl -u "EMAIL:PASSWORD" https://www.guidedtrack.com/programs/PROGRAM_ID/csv

To adapt the above command for your use case, replace EMAIL with the email associated with your GuidedTrack account, PASSWORD with your GuidedTrack password, and PROGRAM_ID with the ID of the program you want to download the data from. You can find the ID by opening the program and looking at the address bar of your browser: if the URL is something like https://www.guidedtrack.com/programs/123/ or https://www.guidedtrack.com/programs/123/edit, then the program ID is 123.
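
The steps above could be wrapped in a small bash script along these lines. This is only a sketch: the function names (program_id_from_url, csv_url_for, download_program_csv), the GT_EMAIL and GT_PASSWORD environment variables, and the output file name are made up for illustration and are not anything GuidedTrack defines. It also assumes the server answers bad credentials with an HTTP error status, which I haven't verified.

```shell
#!/usr/bin/env bash
# Sketch: download a GuidedTrack program's CSV with curl.
# Assumption: GT_EMAIL and GT_PASSWORD are environment variables you set yourself.

# Hypothetical helper: pull the numeric program ID out of a program URL,
# e.g. https://www.guidedtrack.com/programs/123/edit -> 123
program_id_from_url() {
  printf '%s\n' "$1" | sed -n 's#.*/programs/\([0-9][0-9]*\).*#\1#p'
}

# Hypothetical helper: build the CSV URL for a given program ID.
csv_url_for() {
  printf 'https://www.guidedtrack.com/programs/%s/csv\n' "$1"
}

# Download the CSV for the program whose URL is passed as the first argument.
download_program_csv() {
  local id url
  id="$(program_id_from_url "$1")"
  if [ -z "$id" ]; then
    echo "Could not find a program ID in: $1" >&2
    return 1
  fi
  url="$(csv_url_for "$id")"
  # --fail makes curl exit non-zero on HTTP error responses (4xx/5xx)
  # instead of saving the error page as if it were the data.
  curl --fail --silent --show-error \
    -u "${GT_EMAIL}:${GT_PASSWORD}" \
    -o "program_${id}.csv" "$url"
}
```

After sourcing the script and exporting GT_EMAIL and GT_PASSWORD, calling download_program_csv with a URL such as https://www.guidedtrack.com/programs/123/edit would try to save the data as program_123.csv in the current directory. Reading the credentials from the environment keeps your password out of the script file itself.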

by (1.8k points)