The GitHub repo explains the format fairly well. I added content from my eBook after saving the Pages version as plain text. I created the train.json file by breaking the text file into lines and wrapping each line as '{"text": "My text info"}', where "My text info" stands for that line's content. I fudged the valid.json file by copying in a small part of train.json. The biggest problem was finding special characters in my original that weren't JSON-compatible (e.g. the tab character). This was just for adding 'text' content to the model using LoRA.
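For anyone repeating this, here is a rough sketch of the same steps in Python. It isn't what I actually ran, and the function names, file names, and the valid_count default are all made up for illustration. The main point is that json.dumps does the escaping for you, so tabs and quotes in the original text stop being a problem.

```python
import json
from pathlib import Path


def text_to_json_lines(text: str) -> list[str]:
    """Turn each non-empty line of text into a '{"text": ...}' JSON string."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # json.dumps escapes tabs, quotes, backslashes and control characters,
        # so the result is always valid JSON.
        records.append(json.dumps({"text": line}, ensure_ascii=False))
    return records


def write_dataset(source: Path, train: Path, valid: Path, valid_count: int = 20) -> None:
    """Write all records to the training file and reuse the first few for validation.

    valid_count is a hypothetical default; reusing training lines mirrors the
    'fudge' described above rather than a proper held-out split.
    """
    records = text_to_json_lines(source.read_text(encoding="utf-8"))
    train.write_text("\n".join(records) + "\n", encoding="utf-8")
    valid.write_text("\n".join(records[:valid_count]) + "\n", encoding="utf-8")
```

Assuming the plain-text export is called eBook.txt (a placeholder name), you would call write_dataset(Path("eBook.txt"), Path("train.json"), Path("valid.json")). Check the GitHub repo for the exact file names and extensions it expects.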