I’m updating my post to address the “flagged as spam” status.
Why are you flagging a project from your own competition as spam?
Just in case anyone’s curious about how this works, here’s a video link: https://www.youtube.com/watch?v=M4KzKPboSUY
Hi, very cool idea and congratulations! I think we all need better tools to generate powerpoints or documents using AI. I just tried it out and have some feedback for you:
- It gives me a headache that, while it is creating the document, it keeps scrolling down like crazy until the document is finally done. I don’t have a screen recording, but you should know what I’m talking about.
- Document formatting isn’t applied properly, i.e. headlines are vertically aligned. I have a product I made for accountants where I needed to generate AI reports, and I face this issue as well. I’ve attached an image of this issue here:
- While it is generating images, it shows blanks. If the image has actually been created, show it. Otherwise, either show nothing or say that an image is being created, without offering a download. It seems like the image was generated during the flow but it doesn’t show up on the web, so I had to download it separately. Here’s a screenshot:
- Presentations shouldn’t have long blocks of text, and the text shouldn’t be cut off by the slide. Nobody reads more than 1-2 sentences on a slide. It should focus on key details instead, and perhaps it could put the rest into the presenter’s notes?
Hope you found these helpful.
P.S. I launched my Gemini Submission publicly recently, so please give it a try and use it for your business if it suits you: Launching My Submission: HelperHat - 24/7 AI Live Chat Support For Your Website
Thank you so much for the detailed feedback!
Some of the issues you mentioned (1, 4) can be controlled with options.
I aim to provide users with the best experience possible without requiring them to configure these options, as options can be a bit complicated.
However, since you’ve already gained an understanding through testing, let me introduce some of the options.
1. Option to Hide middle steps
There’s an option to hide the middle steps in the processing sequence.
I thought it would be interesting for users to see how the output takes shape.
Although everything runs in parallel, nodes depend on previous responses, so retrieving anywhere from dozens to thousands of answers from Gemini takes about 30 seconds on average (about 3 seconds per depth) to reach the final result.
I found it fun to watch the middle steps while waiting, but thanks to your feedback, I now realize it could be distracting.
I’ll consider hiding the middle steps by default.
Disable the “Use Chat Save”.
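The depth-by-depth processing described above could be sketched roughly as follows. This is only an illustration, not the actual Chat-Ideas implementation: `runByDepth`, `levels`, and `callModel` are hypothetical names, and `callModel` stands in for whatever function sends a node's request to Gemini. Nodes at the same depth start together, and the next depth begins only after all of them have answered, which is why total time grows with the number of depths rather than the number of nodes.

```javascript
// Hypothetical sketch: nodes at the same depth run in parallel,
// and each depth waits for the previous depth to finish.
// `levels` is an array of arrays of nodes, ordered by depth.
// `callModel(node, previousAnswers)` is an assumed async function.
async function runByDepth(levels, callModel) {
  const results = [];
  let previous = [];
  for (const level of levels) {
    // All nodes at this depth start together; each one receives
    // the answers produced by the previous depth.
    previous = await Promise.all(
      level.map((node) => callModel(node, previous))
    );
    results.push(previous);
  }
  return results;
}
```

With this shape, hiding the middle steps would only mean not rendering the intermediate entries of `results` until the last depth has finished.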
2. About the word file malformation
I use Gemini and custom filters to self-evaluate results and report to me any malformed or uncompilable outputs generated by users. However, I’m still having a hard time controlling this uncertainty.
I’ve personally tested the Word format at least 1000 times and ran more tests with test cases using Gemini, but I haven’t seen this case.
It seems like it might be caused by missing fonts. I’ll look into this. Thanks for catching it!
3. Image Generation Issues
I received an alert for this result. Image generation can sometimes get filtered due to regulations or other restrictions.
If that happens, my system retries five times with increasingly filtered prompts. If all five attempts are blocked, it defaults to generating a black image. I’ll work on making the image generation process smoother.
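A minimal sketch of that retry idea, under stated assumptions: `generateImage` and `softenPrompt` are hypothetical helpers (not part of any real API), where `generateImage` throws when a request is blocked and `softenPrompt` returns a more conservative prompt. Unlike the black-image fallback described above, this version returns an explicit `blocked` status so the page can show a message instead of a blank picture.

```javascript
// Hypothetical helpers passed in by the caller:
//   generateImage(prompt) -> resolves to image data, or throws when blocked
//   softenPrompt(prompt, attempt) -> returns a more conservative prompt
async function generateWithRetries(prompt, generateImage, softenPrompt, maxAttempts = 5) {
  let currentPrompt = prompt;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const image = await generateImage(currentPrompt);
      return { status: 'ok', image, attempts: attempt };
    } catch (err) {
      // Blocked or failed: tighten the prompt before the next attempt.
      currentPrompt = softenPrompt(currentPrompt, attempt);
    }
  }
  // Every attempt failed: report it rather than returning a placeholder image.
  return { status: 'blocked', image: null, attempts: maxAttempts };
}
```

The caller can then decide what to render for `status: 'blocked'`, for example a short notice in place of the image and no download link.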
For the image preview, I initially tried using local storage, but it caused performance issues. To improve performance, I switched to serving images directly from the server cache, but this change made it challenging to offer a preview. I’ll keep looking into solutions to improve this experience as well.
4. Controlling Output Tokens for PowerPoint Diagrams
You can adjust the token limit in the PowerPoint diagram prompt used by the node. I’ve experimented with limiting tokens myself as below, but sometimes it results in poor quality.
I’m still working on finding the ideal balance between providing sufficient detail and keeping the content manageable.
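One possible way to approach that balance, following the presenter's notes suggestion earlier in the thread, is to post-process the generated text instead of only limiting tokens. The sketch below is an assumption, not how Chat-Ideas works: `splitSlideText` is a hypothetical helper that keeps the first couple of sentences on the slide and moves the rest into notes.

```javascript
// Hypothetical helper: keep the first `maxSentences` sentences on the slide
// and move the remainder to the presenter's notes.
function splitSlideText(text, maxSentences = 2) {
  // Naive sentence split on ., ! or ? followed by whitespace.
  // Abbreviations such as "e.g. this" will be split incorrectly.
  const sentences = text.trim().split(/(?<=[.!?])\s+/).filter(Boolean);
  return {
    body: sentences.slice(0, maxSentences).join(' '),
    notes: sentences.slice(maxSentences).join(' '),
  };
}
```

This keeps the full detail available to the presenter while the visible slide stays short; the sentence splitter is deliberately simple and would need replacing for real content.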
I’ll give HelperHat a try with Chat-Ideas too.
Then I’m thinking of generating a website with HelperHat embedded.
It should be fun!
I’ll do my best to test your app as well.
Thanks again!
It was definitely interesting to see the middle steps. However, it kept auto-scrolling to the bottom so quickly that it felt overwhelming. I feel like many AI chat apps do something similar, but when a long message is scrolled all the way down, users have to scroll back up to the beginning of the message each time. It’s a delicate balance.
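A common compromise for this is to follow new content only while the reader is already near the bottom, and to stop as soon as they scroll up. A minimal sketch, assuming a scrollable container element; `shouldAutoScroll` and the 40-pixel threshold are hypothetical, while `scrollTop`, `clientHeight`, and `scrollHeight` are the standard DOM element properties.

```javascript
// Hypothetical helper: decide whether to keep following new output.
// Pass in a DOM element (or any object with the same three properties).
function shouldAutoScroll({ scrollTop, clientHeight, scrollHeight }, thresholdPx = 40) {
  // How far the visible area's bottom edge is from the end of the content.
  const distanceFromBottom = scrollHeight - (scrollTop + clientHeight);
  return distanceFromBottom <= thresholdPx;
}
```

In a browser, the check would run before appending new output: if it returns `true`, set `container.scrollTop = container.scrollHeight` after the append; otherwise leave the scroll position alone so the reader can stay where they are.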
As for the Word formatting, what I figured out was that I should specifically ask the LLM (I used a different one for a different app) for markdown, then convert that to HTML and then to Word. It works best somehow. It’s a hack. In my case, I used Node.js libraries: marked, JSDOM, html2docx
I meant that I faced the exact same issue of generating word files where titles became like that.
Oh I see you’ve already tried html2docx.
I’m currently using Java code to generate Word files.
I tried html2docx before but there were many edge cases when converting markdown to HTML and then to a Word file.
I think there should be a smoother way to use them, but I haven’t looked deeply into it yet.
I’ll give it another shot.
Thanks for the advice.
The idea is cool but it really sucks because there are way too many bugs… I wish it worked though.
For example, with almost any prompt I get tons of garbage messages back
Also, there’s no point in having a menu to enable video, etc, if a user can just prompt for it and get the file anyway.
The site is bugged because I can access and use it for free, but if I open the login page I can’t use it for free anymore, because the site keeps redirecting me to /login even when I erase it and press enter.
Images, etc. should be shown in the browser, with an optional download button, not automatically downloaded.
When generating videos I get 30+ .mp3 and 30+ image files, just random things no one needs to see or cares about, until I see the .mkv…
1. About the Middle Step Message
It’s not just random things. These are the components that form your final answer.
As I mentioned above, I thought it would be fun to watch the progress while the result is generated. But if it bothers you, I’ll consider hiding it.
2. About the video button
If you specifically want to generate a video, use that button.
On autopilot, you have to direct Chat-Ideas to generate a video file.
However, if you turn on the video button, you can generate a video without directing it.
You can also customize many options by choosing a prompt.
I can help you go through this if you have any specific needs.
3. About Login redirection
This isn’t a bug.
I intended for the first visit to use a guest login.
After that, login is required.
If you really do not want to log in, you can clear your cache to be recognized as a new user.
I’m sorry if you had a bad experience.
This is what worked for me. Since LLMs are so good at generating markdown content, I concluded for that product that I should generate markdown first, then convert it to HTML and then to docx.
```javascript
const { marked } = require('marked');
const { JSDOM } = require('jsdom');

// Convert a markdown string to DOCX via an intermediate HTML document.
async function markdownToDocument(markdown) {
  const html = marked(markdown);

  // Create a Document object from the HTML string
  const dom = new JSDOM(html);
  const document = dom.window.document;

  // Dynamically import the html2docx function
  const { html2docx } = await import('@adobe/helix-importer');

  // Configuration for html2docx
  const config = {
    createDocumentFromString: htmlString => new JSDOM(htmlString).window.document,
    setBackgroundImagesFromCSS: false,
  };

  // Using html2docx to convert the HTML document to DOCX
  const docxBuffer = await html2docx('http://accountail.com', document, '', config, {});
  return docxBuffer;
}

module.exports = {
  markdownToDocument,
};
```
I’ll definitely give this a try!
I think this could allow for a more flexible format.