The idea of Storysight is to show what can be built when you combine recent advances in AI (specifically generative ML) with spatial computing, bringing to life what an app in this new era could look like.
Our idea was to create an immersive experience around reading novels.
Firstly, to add extra visual spice to the book reading experience we decided to take the reader inside the chapters themselves. Text-to-image diffusion models like MidJourney take in textual content and produce visual content, i.e. images, and books are full of textual content.
Secondly, we created a reading assistant who could support the reading experience by answering questions and highlighting interesting plot points.
Taking readers inside the story
Generating the panoramas
When riffing with the team early on, we saw that we could use GPT-4 to condense the text of a whole chapter into just a few lines describing the chapter's visual setting, which we could then feed into a text-to-image diffusion model.
MidJourney (at the time of writing) does not generate panoramas, so its aspect ratios are not ideal if we want to fully immerse the user in a 360-degree, top-to-bottom environment.
Luckily there is a model that does exactly this: Skybox.
This allowed us to feed the text of a chapter into GPT-4, have it distil the chapter's main setting down to a short prompt, and pass that prompt to Skybox to generate the panorama we drop the reader into.
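The condensing step can be sketched as a single call to OpenAI's chat completions endpoint. This is a minimal, hypothetical sketch: the model name, prompt wording, and `condenseChapter` function are illustrative, not the exact ones used in Storysight.

```swift
import Foundation

// Hypothetical sketch: ask GPT-4 to distil a chapter into a short
// scene description we can hand to Skybox as a panorama prompt.
func condenseChapter(_ chapterText: String, apiKey: String) async throws -> String {
    var request = URLRequest(url: URL(string: "https://api.openai.com/v1/chat/completions")!)
    request.httpMethod = "POST"
    request.setValue("Bearer \(apiKey)", forHTTPHeaderField: "Authorization")
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")

    let body: [String: Any] = [
        "model": "gpt-4",
        "messages": [
            ["role": "system",
             "content": "Condense the chapter into two or three lines describing its main visual setting, phrased as a text-to-image prompt."],
            ["role": "user", "content": chapterText]
        ]
    ]
    request.httpBody = try JSONSerialization.data(withJSONObject: body)

    let (data, _) = try await URLSession.shared.data(for: request)

    // Pull the assistant's reply out of the JSON response.
    let json = try JSONSerialization.jsonObject(with: data) as? [String: Any]
    let choices = json?["choices"] as? [[String: Any]]
    let message = choices?.first?["message"] as? [String: Any]
    return message?["content"] as? String ?? ""
}
```

The returned string is what gets submitted to Skybox as the panorama prompt.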
Showing the environments in-app
Now that we have the panoramas, all we need to do is show them to the user in the app.
To do this I used SwiftUI and RealityKit’s new `ImmersiveSpace` that has a nested `RealityView` which is essentially the 3D space the user sees in-app. In this `RealityView` I make a sphere that the user sits in the middle of and set the “material” of the sphere to be the panorama that Skybox generates.
This works well because the panorama maps neatly onto the sphere (this is the same projection Skybox uses on its web app), giving the environment a sense of depth that makes it feel like more than a static image. Since Storysight uses the same technique, the panorama looks just as good in-app as it does in Skybox.
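That setup can be sketched roughly as follows, assuming visionOS's `RealityView` API. The view name and the texture resource name are illustrative; the inverted scale is one common way to make a sphere's texture face inward towards the viewer.

```swift
import SwiftUI
import RealityKit

// Hypothetical sketch: a view for an ImmersiveSpace that wraps the
// user in a large sphere textured with the Skybox panorama.
struct PanoramaView: View {
    var body: some View {
        RealityView { content in
            // A large sphere the user sits at the centre of.
            let sphere = MeshResource.generateSphere(radius: 1000)

            // Unlit material so the panorama renders at full brightness,
            // textured with the generated panorama image from the app bundle.
            var material = UnlitMaterial()
            if let texture = try? TextureResource.load(named: "chapterPanorama") {
                material.color = .init(texture: .init(texture))
            }

            let entity = ModelEntity(mesh: sphere, materials: [material])
            // Flip the sphere inside out so the texture faces the user.
            entity.scale = SIMD3<Float>(x: -1, y: 1, z: 1)
            content.add(entity)
        }
    }
}
```

This view would then be hosted in the app's `ImmersiveSpace` scene so that it fills the user's surroundings rather than sitting in a window.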
Creating a reading assistant
Another feature that we introduced into Storysight is a GPT-powered reading assistant, primed to answer questions only about the book the user is reading and never to reveal information from subsequent chapters or books, avoiding spoilers.
To prime the GPT model, I used OpenAI's chat endpoint with a `system` prompt. The system prompt is what the model is "anchored" to, so you can set constraints on how it responds. In this case, that was to act as a reading assistant that only answers questions about the book, and specifically the chapter the user is reading, without revealing any information from beyond that chapter. This all happens programmatically in the background, so the user never sees it taking place.
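The shape of the message array sent to the chat endpoint can be sketched like this. The prompt wording and the variable names are illustrative assumptions, not Storysight's actual prompt:

```swift
// Hypothetical sketch of the messages sent to OpenAI's chat endpoint.
// The system message pins the assistant to the current book and chapter.
let bookTitle = "Pride and Prejudice"
let chapterNumber = 4
let userQuestion = "Who is Mr. Bingley?"

let systemPrompt = """
You are a reading assistant for "\(bookTitle)". The reader is currently on \
chapter \(chapterNumber). Only answer questions about this book, and never \
reveal plot details from after chapter \(chapterNumber) or from other books, \
to avoid spoilers.
"""

var messages: [[String: String]] = [
    ["role": "system", "content": systemPrompt]
]

// Each user turn is appended and the full history is resent with every
// request, which is how the model keeps conversational context.
messages.append(["role": "user", "content": userQuestion])
```

Because the system message travels with every request, the constraints hold for the whole conversation without the user ever seeing them.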
Additionally, as a small UX win, I added suggestions for things readers commonly want to know, such as a chapter summary, a recap of events, or the literary devices in play. When the user selects one, the app inserts a well-crafted prompt that fulfils that request. Of course, the user can also enter their own prompts, and just like ChatGPT, the assistant remembers the context of the conversation.
We’re really excited about the possibilities for intelligent applications in a spatial interface. This is just the first of many demos we’ll be sharing. Follow along on Loomery’s social channels or get in touch to find out more.