Several tools can be used to create XSeen content. Which ones you need depends on what you want to do and what you already have available.
The THREE.js editor can be used to assemble components into a scene. Its output is a THREE JSON file that is loaded into the XSeen display. The scene can then load additional resources such as textures or models.
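As a rough illustration of what such a scene file contains, the Python sketch below walks a small hand-written sample and lists the objects in it. It assumes the file follows the three.js object/scene JSON layout (a top-level "object" node with "type", "name", and nested "children"). The sample data and the list_scene_objects helper are made up for this example and are not part of XSeen or THREE.js.

```python
import json

# Hand-written sample; assumed to mirror the three.js object/scene JSON layout.
SAMPLE_SCENE = json.dumps({
    "metadata": {"version": 4.5, "type": "Object", "generator": "Object3D.toJSON"},
    "object": {
        "type": "Scene",
        "name": "Scene",
        "children": [
            {"type": "Mesh", "name": "Box"},
            {"type": "Group", "name": "Lights", "children": [
                {"type": "PointLight", "name": "Key"},
            ]},
        ],
    },
})


def list_scene_objects(scene_text):
    """Return a (depth, type, name) tuple for every node in the scene tree."""
    doc = json.loads(scene_text)
    found = []

    def walk(node, depth):
        found.append((depth, node.get("type", "?"), node.get("name", "")))
        for child in node.get("children", []):
            walk(child, depth + 1)

    walk(doc["object"], 0)
    return found


# Print an indented outline of the sample scene.
for depth, kind, name in list_scene_objects(SAMPLE_SCENE):
    print("  " * depth + f"{kind}: {name}")
```

Running a check like this on an exported file before loading it into XSeen is one quick way to confirm that the scene contains the objects you expect.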
The primary model format is glTF, a widely adopted model format that supports animation and physically based rendering. When starting out, it is best to use existing models ([example link]). You can also create your own with tools such as Blender and Maya, though you may need to install a glTF exporter plug-in for your tool.
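Because a .gltf file is JSON, it is easy to check what an existing model provides before using it. The following is a minimal sketch, assuming a glTF 2.0 file in its JSON (.gltf) form rather than the binary .glb form. The summarize_gltf helper and the sample data are hypothetical, and the sample is trimmed to the fields being counted, so it is not a loadable model.

```python
import json

# Trimmed, hand-written sample using glTF 2.0 top-level property names.
# It omits buffers and accessors, so it is for illustration only.
SAMPLE_GLTF = json.dumps({
    "asset": {"version": "2.0"},
    "meshes": [{"primitives": [{"attributes": {"POSITION": 0}, "material": 0}]}],
    "materials": [{
        "name": "Paint",
        "pbrMetallicRoughness": {"metallicFactor": 0.0, "roughnessFactor": 0.5},
    }],
    "animations": [{"name": "Spin", "channels": [], "samplers": []}],
    "images": [{"uri": "paint.png"}],
})


def summarize_gltf(gltf_text):
    """Count meshes, animations, and PBR materials, and list external images."""
    doc = json.loads(gltf_text)
    materials = doc.get("materials", [])
    return {
        "version": doc.get("asset", {}).get("version"),
        "meshes": len(doc.get("meshes", [])),
        "animations": len(doc.get("animations", [])),
        "pbr_materials": sum(1 for m in materials if "pbrMetallicRoughness" in m),
        # Images may also be embedded via a bufferView; only URIs are listed here.
        "image_uris": [img["uri"] for img in doc.get("images", []) if "uri" in img],
    }


print(summarize_gltf(SAMPLE_GLTF))
```

The list of image URIs is useful when copying a model to a web server, since those files must be uploaded alongside the .gltf file.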
Textures and other images are created with image manipulation programs such as Photoshop, GIMP, or Paint. Save your work in a browser-compatible format (JPEG or PNG).
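A file extension does not guarantee the actual format, so it can help to check a texture's leading bytes before publishing it. The sketch below, under the assumption that only JPEG and PNG are acceptable, compares those bytes against the standard PNG signature and the JPEG start-of-image marker. The function names are illustrative and not part of XSeen.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # fixed 8-byte signature that starts every PNG
JPEG_PREFIX = b"\xff\xd8\xff"         # start-of-image marker plus next marker byte


def sniff_image_format(data):
    """Return "PNG" or "JPEG" based on the leading bytes, else None."""
    if data.startswith(PNG_SIGNATURE):
        return "PNG"
    if data.startswith(JPEG_PREFIX):
        return "JPEG"
    return None


def sniff_image_file(path):
    """Read only the first 8 bytes of a file and identify its format."""
    with open(path, "rb") as handle:
        return sniff_image_format(handle.read(8))


# Demonstration on in-memory byte strings rather than real files.
print(sniff_image_format(PNG_SIGNATURE + b"rest of file"))
print(sniff_image_format(b"\xff\xd8\xff\xe0" + b"rest of file"))
print(sniff_image_format(b"BM" + b"rest of file"))  # BMP header, not accepted
```

A file that returns None here (for example a BMP saved from Paint) should be re-exported as JPEG or PNG before it is used as a texture.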