The LLM consumes about 4 GB of memory. This fix defers model allocation until the chat is actually opened, so the memory is not reserved at startup.
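A minimal sketch of the lazy-allocation pattern this fix describes. All names here (`ChatSession`, `_load_model`) are hypothetical placeholders, not taken from the actual codebase:

```python
class ChatSession:
    """Holds a reference to the model but defers loading it."""

    def __init__(self):
        # Nothing allocated at construction time; startup stays cheap.
        self._model = None

    @property
    def model(self):
        # Allocate the model on first access, i.e. when the chat opens.
        if self._model is None:
            self._model = self._load_model()
        return self._model

    def _load_model(self):
        # Placeholder for the real (~4 GB) model load.
        return object()
```

With this pattern the expensive load happens once, on first use, and subsequent accesses return the cached instance.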