Apple researchers have released a new AI model that allows users to describe in plain language what they want to change in a photo without the need for photo editing software.
The MGIE model, which Apple worked on with the University of California, Santa Barbara, allows users to crop, resize, flip and add filters to images using textual prompts.
MGIE, which stands for MLLM-Guided Image Editing, handles both simple and more complex editing tasks, such as reshaping specific objects in a photo or making them more vivid. The model combines two different applications of multimodal large language models. First, it learns to interpret the user's instruction. Then it "imagines" what the edit should look like (for example, a request to make the sky in a photo bluer becomes an increase in brightness on the sky portion of the image).
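That second step amounts to translating a terse instruction into an explicit edit operation. The snippet below is a minimal, hypothetical sketch of the idea in plain Python; the instruction-to-edit lookup table and the grid-of-pixels image representation are illustrative inventions, not MGIE's actual API, which instead derives the expressive instruction with a multimodal LLM:

```python
# Illustrative sketch only: mapping a terse instruction to a concrete edit,
# in the spirit of MGIE's "imagined" expressive-instruction step.
from typing import Callable

# An "image" here is just a grid of brightness values (0-255).
Image = list[list[int]]

def brighten(image: Image, amount: int = 40) -> Image:
    """Raise every pixel's brightness, clamping to the valid 0-255 range."""
    return [[min(255, px + amount) for px in row] for row in image]

def add_contrast(image: Image, factor: float = 1.3) -> Image:
    """Scale pixel values away from mid-gray (128) to increase contrast."""
    return [[max(0, min(255, int(128 + (px - 128) * factor))) for px in row]
            for row in image]

# Hypothetical lookup table standing in for what the multimodal LLM derives
# from the user's instruction and the image content.
EDITS: dict[str, Callable[[Image], Image]] = {
    "make the sky bluer": lambda img: brighten(img, 40),
    "add more contrast to simulate more light": lambda img: add_contrast(img, 1.3),
}

def apply_instruction(image: Image, instruction: str) -> Image:
    edit = EDITS.get(instruction.lower())
    if edit is None:
        raise ValueError(f"no edit derived for: {instruction!r}")
    return edit(image)

photo = [[100, 120], [90, 200]]
print(apply_instruction(photo, "make the sky bluer"))
```

In the real system the mapping is learned, not hard-coded, and the edit is applied by a diffusion model rather than per-pixel arithmetic, but the flow from natural-language request to concrete image operation is the same.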
When editing a photo with MGIE, the user just types what they want changed in the image. One example: given a photo of a pepperoni pizza, the query "make it healthier" adds vegetable toppings. A photo of tigers in the Sahara looks dark, but after telling the model to "add more contrast to simulate more light", the image becomes brighter.
Apple has made MGIE available on GitHub and also released a web demo on Hugging Face Spaces. The company has not said what plans it has for the model beyond research.
Some image generation platforms, such as OpenAI's DALL-E 3, can perform simple edits on the photos they create via text input. Adobe, the Photoshop maker most people turn to for image editing, has its own AI editing model: Firefly, whose generative fill feature adds generated backgrounds to photos.
Apple hasn’t been a major player in generative AI, unlike Microsoft, Meta or Google, but Apple CEO Tim Cook has said the company wants to add more AI features to its devices this year. Apple researchers had already released an open-source machine learning framework called MLX in December to make it easier to train AI models on Apple chips.