Last updated: July 2, 2024
AI is as hot right now as the web was in the Dot com boom in the 90s, when buzzwords such as e-commerce and SEO were hot off the press.. Keeping up with advancements of AI right now is like drinking from the proverbial firehose.
Over the past decade we’ve seen some major trends in tech, and the job market has reflected these trends. As storage has increased and costs have come down, accompanied by advancements in processing power, we’ve unlocked the ability to work with massive amounts of data. First we had Big Data, then Machine Learning and now we’re hot for Artificial Intelligence. GPTs (Generative pre-trained transformers) are neural networks and a type of large language models (LLMs) leveraging natural language processing to power Generative Artificial Intelligence (GenAI). They are trained on large data sets and designed to evaluate input and generate output (text, images, audio, video, 3d, etc) from prompts and a variety of other inputs. GenAI, while controversial in some areas, is a transformative productivity tool, unlocking the ability to prototype and create content in lightning speed. These AI tools can be used to build chatbots, summarize long-form text, provide academic support, generate social media content, stylize content, data analysis, image analysis, image creation, animating images, creating memes, generate quizzes, coding assistance /websites, blog posts, travel plans and more. While GenAI can outperform human speed capabilities, it still requires human intervention from crafting the perfect prompt to final edits. It’s not perfect. I quote a speaker from AWE XR ’24 “AI is not generative, it is derivative”. It still requires human creativity, human input, and most importantly, human context. It is great for prototyping and quickly analyzing information.
GenAI Overview
Whether you are looking for a model or an application that leverages LLMs, it helps to understand the various tools available in the AI Tech Stack. Here’s a good source for finding some tools for your use case: TopAI Tools . I also hopped over to eWeek for news and updates and found a ranking of the top 150 AI tools.
Play with AI models
You can access most of the models below natively from their organization or in APIs, or you can use a number of cloud solutions providers such as AWS Bedrock, Google Cloud, Microsoft, etc.
- Kaggle for over 350k public data sets and over 1M public notebooks.
- Hugging Face
AI/Model | Features/Costs | NOTES |
---|---|---|
Chat GPT | New in ChatGPT GPT-4o Personal or enterprise Desktop and mobile apps can hear, see and speak Pricing: $19.99 / month Free tier (3.5) | You can now show ChatGPT one or more images. Troubleshoot why your grill won’t start, explore the contents of your fridge to plan a meal, or analyze a complex graph for work-related data. To focus on a specific part of the image, you can use the drawing tool in the mobile app. |
DALL-E 3 | image from text | |
Sora | video from text |
AI/Model | Features/Costs | NOTES |
---|---|---|
Claude | New in Claude * Convert UI design to front end code * Extract text from Images * Transcribe handwritten notes API access • API integration; independently interact with APIs and the web; Developers can setup a toolbox. Pricing depends on how much text you work with, measured in tokens (1,000 tokens are about 750 words) | Check out the GPT Store Update: 6/8 https://www.youtube.com/watch?v=oGFh62u-5dY • for general use; natural language processing, Claude figures out what it needs from the web, tells the API what it needs Accessible through: • Anthropic Messages API • Amazon Bedrock • Google Vertex AI |
Claude 3.5 Sonnet | balance skill and speed; efficient; AI interact with people $3.00 for every million tokens | |
Claude Opus | most advanced; complex tasks; deep thinking $15.00 for every million tokens | |
Claude Haiku | fastest and compact model; rapid response, efficient resource utilization $0.25 for every million tokens |
Meta
AI/Model | Features/Costs | Notes |
---|---|---|
Llama 3 > Meta AI Assistant | Model details Card Text Input/output | Available through Hugging Face or Kaggle |
AI/Model | Features/Costs |
---|---|
Gemini > Chat with Gemini |
AI/Model | Features/Costs |
---|---|
Mistral 8x22B chat console | Open Source Mixtral 8x22B is currently the most performant open model. A 22B sparse Mixture-of-Experts (SMoE). Uses only 39B active parameters out of 141B. Fluent in English, French, Italian, German, Spanish, and strong in code 64k context window Native function calling capacities Function calling and json mode available on our API endpoint $2 /1M tokens $6 /1M tokens |
Mistral 8x7B | A 7B sparse Mixture-of-Experts (SMoE). Uses 12.9B active parameters out of 45B total. Fluent in English, French, Italian, German, Spanish, and strong in code 32k context window input: $0.7 /1M tokens output: $0.7 /1M tokens |
Mistral 7B | A 7B transformer model, fast-deployed and easily customisable. Small, yet very powerful for a variety of use cases. Performant in English and code 32k context window input: $0.25 /1M tokens output: $0.25 /1M tokens |
mistral-small-2402 codestral-2405 mistral-large-2402 | Mistral also has several optimized models. See their pricing page for details |
Multimedia
Organization | AI/Model | Features/Costs | Notes |
---|---|---|---|
Stability.ai | Stable Diffusion 3 Stable Assistant, Stable Artisan Stable Audio | Image Generation from text Video Generation from text Music Generation from text of audio samples 3d Models Multilingual Language Models Pricing is per credit. Credits are priced at $10 per 1,000 credits, which is enough credits for roughly 5,000 SDXL 1.0 images. First 25 credits are free. | Self-hosted, developer platform, cloud hosted AWS, Google Cloud, NVidia, Intel Developer Cloud |
Spline | 3D object and app generation. iOs, iPad, Mac, Apple Vision Pro |
Cohere https://cohere.com/
3D from Text
https://spline.design/ai-generate
NVideo get3d and 3d tools
Midjourney
Computer Vision
Organization | Model | Features/Costs | Notes |
---|---|---|---|
Synthesia | Computer Vision – Security, Identity verification, AR/VR/XR, Virtual Try on, Driver Monitoring, pedestrian detection |
Image/Video
Organization | Model | Features/Costs | Notes |
---|---|---|---|
Adobe | Adobe Creative Suite | F | |
Luma | Luma Dream Machine | ||
Morph Studio | |||
Runway | Gen 3 Alpha, Gen 2, Gen 1 video to video | text to image, image to image, frame interpolation, upscale image, video to video | |
Topaz | photo and video editing | ||
Audio
Avatars
Generative AI video platform with AI avatars, text to video
Animation
Organization | Model | Features/Costs | Notes |
---|---|---|---|
Kaiber | | | |
| | |
Marketing & Creator Tools
FBRC.ai – storybuildlng
Fabric.space
https://www.snackshop.app – TikTok for Graphic Novels