I want a progressive web app that transcribes my voice using Whisper API and lets me copy the transcription to the clipboard.
The app should be very simple - just (from top to bottom) a logo.png, big red record/stop button, a text field (where transcription appears) and a big `copy to clipboard` button.
You click to record and click again to stop; the audio is sent to the Whisper API, and the transcription then appears in the text field, ready to copy.
The design should be done very much with phones in mind, as that's where I will be using it.
It would be a nice effect if, while recording, the button changed color and throbbed.
It would also be a nice effect if the background was a light gradient - from very light purple to very light pink.
Style could maybe have some neon colors.
This can be a ONE PAGE APP. We do NOT need to save to firebase so NO NEED for firebase config.
This is a mobile-first app so will definitely need manifest.json, even if we can't send to the Whisper API when offline.
# Voice 2 Clip
A mobile-first Progressive Web App that transcribes voice recordings using OpenAI's Whisper API, featuring a simple, intuitive interface for recording and copying transcriptions.
## Project Overview
Voice 2 Clip is designed to be a straightforward, user-friendly voice transcription tool. With a minimalist interface featuring a pulsing record button, instant transcription display, and easy clipboard copying, it's perfect for quick voice-to-text needs on mobile devices.
Key features:
- One-tap voice recording
- Real-time recording status indication
- OpenAI Whisper API integration for accurate transcription
- Instant copy-to-clipboard functionality
- Mobile-first responsive design
- PWA capabilities: installable, with limited offline access (transcription requires a connection)
## Features & Components
### Main Components
- Record Button: Animated, pulsing interface for voice recording
- Transcription Field: Clean, readable display of transcribed text
- Copy Button: Quick clipboard access for transcribed content
- Status Indicator: Visual feedback for recording and processing states
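
The record button and status indicator both depend on which state the app is in. A minimal sketch of that state handling follows, assuming three states (idle, recording, processing). The state names, event names, and labels are hypothetical and are not taken from the project's source.

```javascript
// Hypothetical state names; the README only mentions "recording and processing states".
const STATES = Object.freeze({
  IDLE: "idle",
  RECORDING: "recording",
  PROCESSING: "processing",
});

// Allowed transitions: idle -> recording -> processing -> idle.
const TRANSITIONS = Object.freeze({
  idle: { TOGGLE: "recording" },
  recording: { TOGGLE: "processing" },
  processing: { DONE: "idle", FAIL: "idle" },
});

// Returns the next state, or the current state when the event is not allowed
// (for example, tapping the button while a transcription is in flight).
function nextState(state, event) {
  const allowed = TRANSITIONS[state];
  if (allowed && Object.prototype.hasOwnProperty.call(allowed, event)) {
    return allowed[event];
  }
  return state;
}

// Text the status indicator could show for each state (assumed wording).
function statusLabel(state) {
  switch (state) {
    case STATES.RECORDING:
      return "Recording...";
    case STATES.PROCESSING:
      return "Transcribing...";
    default:
      return "Tap to record";
  }
}
```

In a React component this could back a `useState(STATES.IDLE)` value, with the button's click handler calling `nextState(state, "TOGGLE")` and the pulsing CSS class applied only while the state is `STATES.RECORDING`.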
### Pages
- TranscriptionPage: Main recording interface
- TranscriptionResultsPage: Displays completed transcriptions
## Prerequisites & Installation
1. Clone the repository:
```bash
git clone https://github.com/yourusername/voice2clip.git
cd voice2clip
```
2. Install dependencies:
```bash
npm install
npm install react-router-dom
```
3. Create a `.env` file in the root directory:
```
REACT_APP_OPENAI_API_KEY=your_openai_api_key
```
## Running the App
Start the development server:
```bash
npm start
```
Access the app at `http://localhost:3000`
## Deployment Notes
Build for production:
```bash
npm run build
```
The app includes PWA support through:
- `manifest.json` for app installation
- Service worker for limited offline functionality (transcription still requires a connection to the Whisper API)
- Mobile-optimized viewport and styling
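
As an illustration of the service worker item above, registration could look roughly like the sketch below. The function name and the `/service-worker.js` path are assumptions; the actual file name depends on how the build is configured.

```javascript
// Registers a service worker when the browser supports it.
// "/service-worker.js" is an assumed path; use whatever file your build emits.
async function registerServiceWorker(nav, scriptUrl = "/service-worker.js") {
  if (!nav || !("serviceWorker" in nav)) {
    // Unsupported browser: the app still works, just without offline caching.
    return null;
  }
  try {
    return await nav.serviceWorker.register(scriptUrl);
  } catch (error) {
    console.warn("Service worker registration failed:", error);
    return null;
  }
}
```

In the browser this would be called once at startup as `registerServiceWorker(navigator)`. Passing `navigator` in explicitly keeps the function easy to exercise outside a browser.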
## Environment Variables
Required environment variables:
- `REACT_APP_OPENAI_API_KEY`: Your OpenAI API key for Whisper API access
Create React App automatically loads these from the `.env` file.
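
A small sketch of how the key might be read and validated in code. The helper name `getOpenAiApiKey` is hypothetical; the variable name comes from the `.env` example above.

```javascript
// Reads the key from an env-like object. In Create React App you would pass
// process.env, whose REACT_APP_* values are inlined into the bundle at build time.
function getOpenAiApiKey(env) {
  const key = (env.REACT_APP_OPENAI_API_KEY || "").trim();
  if (!key) {
    throw new Error(
      "REACT_APP_OPENAI_API_KEY is not set; add it to .env and restart the dev server."
    );
  }
  return key;
}
```

Usage would be `const apiKey = getOpenAiApiKey(process.env);`. Because Create React App embeds `REACT_APP_*` values in the built JavaScript, the key is readable by anyone who loads the app.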
## API Integration
The app uses OpenAI's Whisper API for voice transcription:
- Implementation in `src/services/whisperApi.js`
- Handles audio file conversion and API communication
- Returns transcribed text in JSON format
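
The contents of `src/services/whisperApi.js` are not shown here, so the following is only a sketch of what such a service could look like, assuming OpenAI's `/v1/audio/transcriptions` endpoint with the `whisper-1` model. The function names and the `recording.webm` file name are illustrative assumptions.

```javascript
const WHISPER_URL = "https://api.openai.com/v1/audio/transcriptions";

// Builds the multipart request. The file name "recording.webm" is an assumption;
// it should match the format the recorder actually produces.
function buildTranscriptionRequest(audioBlob, apiKey) {
  const form = new FormData();
  form.append("file", audioBlob, "recording.webm");
  form.append("model", "whisper-1");
  return {
    url: WHISPER_URL,
    options: {
      method: "POST",
      // Content-Type is left unset so fetch can add the multipart boundary itself.
      headers: { Authorization: `Bearer ${apiKey}` },
      body: form,
    },
  };
}

// Sends the audio and resolves to the transcribed text.
// fetchImpl is injectable so the function can be exercised without a network.
async function transcribeAudio(audioBlob, apiKey, fetchImpl = fetch) {
  const { url, options } = buildTranscriptionRequest(audioBlob, apiKey);
  const response = await fetchImpl(url, options);
  if (!response.ok) {
    throw new Error(`Transcription failed with status ${response.status}`);
  }
  const data = await response.json();
  return data.text;
}
```

A caller would then do something like `const text = await transcribeAudio(blob, apiKey);` and place `text` in the transcription field.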
## Technical Requirements
- Node.js 14.0.0 or later
- Modern web browser with MediaRecorder API support
- Internet connection for API functionality
- OpenAI API key for transcription services
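
To make the MediaRecorder requirement concrete, here is a hedged sketch of a support check plus a start/stop recording helper. `canRecord` and `startRecording` are hypothetical names, and the `env` parameter (normally `window`) is an assumption made so the code can also run outside a browser.

```javascript
// Returns true when the environment exposes what recording needs.
function canRecord(env) {
  return Boolean(
    env &&
      env.MediaRecorder &&
      env.navigator &&
      env.navigator.mediaDevices &&
      typeof env.navigator.mediaDevices.getUserMedia === "function"
  );
}

// Starts recording and resolves to a stop() function.
// Calling stop() resolves to a Blob containing the recorded audio.
async function startRecording(env) {
  if (!canRecord(env)) {
    throw new Error("Audio recording is not supported in this browser.");
  }
  const stream = await env.navigator.mediaDevices.getUserMedia({ audio: true });
  const recorder = new env.MediaRecorder(stream);
  const chunks = [];
  recorder.addEventListener("dataavailable", (event) => {
    if (event.data && event.data.size > 0) chunks.push(event.data);
  });
  recorder.start();

  return function stop() {
    return new Promise((resolve) => {
      recorder.addEventListener(
        "stop",
        () => {
          // Release the microphone so the browser's recording indicator turns off.
          stream.getTracks().forEach((track) => track.stop());
          resolve(new Blob(chunks, { type: recorder.mimeType }));
        },
        { once: true }
      );
      recorder.stop();
    });
  };
}
```

In the app, the first tap would call `const stop = await startRecording(window);` and the second tap `const blob = await stop();`, after which the blob can be sent for transcription.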
## Browser Support
Tested and supported on:
- Chrome (latest)
- Firefox (latest)
- Safari (latest)
- Chrome for Android
- Safari iOS
## Styling
The app features:
- Gradient background (light purple to light pink)
- Neon accent colors
- Responsive design for mobile devices
- Animated recording button
- Clean, modern typography using Montserrat font
## Contributing
Contributions are welcome! Please read the contributing guidelines and submit pull requests for any enhancements.
## License
MIT License - see LICENSE file for details
SESSION 9 - DEBUG SUMMARY
Here's a comprehensive summary of the Voice 2 Clip app:
1. WHAT THE APP DOES
Voice 2 Clip is a Progressive Web App (PWA) designed primarily for mobile use that provides voice-to-text transcription using OpenAI's Whisper API. Users record audio through their device's microphone; when they stop recording, the audio is sent to the API and transcribed to text. The transcribed text appears in a text field and can be copied to the clipboard with a single tap.
The app features a minimalist interface with a gradient background (light purple to light pink), centered around a pulsating record button that changes state during recording. The UI is optimized for mobile devices and includes visual feedback mechanisms like button animations and status indicators.
From a technical perspective, the app uses React.js for the frontend, implements the MediaRecorder API for audio capture, and integrates with OpenAI's Whisper API for transcription. It's structured as a PWA with appropriate manifest settings, though offline functionality is limited due to the requirement for API access.
2. FILE STRUCTURE AND DEPENDENCIES
Core Files (Most Critical):
- src/App.js - Main application container that handles routing and overall app structure. Changes here affect the entire application flow.
- src/services/whisperApi.js - Manages all communication with OpenAI's Whisper API. Critical for core transcription functionality.
- src/components/AudioRecorder.js - Handles all audio recording logic using MediaRecorder API. Central to the app's main functionality.
Primary Component Files:
- src/pages/TranscriptionPage.js - Main page component that orchestrates recording, transcription, and user interaction.
- src/components/RecordButton.js & .css - Controls recording state and provides visual feedback.
- src/components/TranscriptionField.js & .css - Displays transcribed text and handles text display formatting.
- src/components/CopyButton.js & .css - Manages clipboard functionality and button styling.
- src/components/StatusIndicator.js & .css - Provides user feedback about current app state.
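
For reference, the clipboard logic behind CopyButton.js could be as small as the sketch below. It assumes the asynchronous Clipboard API (`navigator.clipboard.writeText`); the helper name `copyToClipboard` is hypothetical, and the real component may differ.

```javascript
// Copies text using the async Clipboard API and reports whether it succeeded.
// nav is passed in explicitly (normally the global navigator object).
async function copyToClipboard(text, nav) {
  if (!text) return false; // Nothing to copy yet.
  if (!nav || !nav.clipboard || typeof nav.clipboard.writeText !== "function") {
    return false; // Clipboard API unavailable (for example, a non-secure context).
  }
  try {
    await nav.clipboard.writeText(text);
    return true;
  } catch (error) {
    // The browser refused the write, for example because permission was denied.
    return false;
  }
}
```

The button's click handler would call `copyToClipboard(transcription, navigator)` and could use the returned boolean to show a brief "Copied" or "Copy failed" message.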
Styling Files:
- src/App.css - Contains global styles and theme definitions.
- Individual component CSS files - Contain component-specific styles.
3. FRONTEND STRUCTURE
Main Page (TranscriptionPage):
- Features logo, record button, transcription field, and copy button
- Handles recording state management and API communication
- Files: TranscriptionPage.js, TranscriptionPage.css
Components:
- Record Button Section: Pulsating button for recording control
- Transcription Field: Displays transcribed text
- Copy Button: Enables clipboard functionality
- Status Indicator: Shows current app state
- Related files: All components in src/components/
4. RECOMMENDED INITIAL FILES FOR EDITING
When starting a new editing thread, I recommend providing these files initially:
1. src/App.js
2. src/services/whisperApi.js
3. src/components/AudioRecorder.js
4. src/pages/TranscriptionPage.js
5. src/App.css
These files provide the core functionality and structure of the app. Having access to them immediately will allow for better understanding of any proposed changes and their potential impacts.
This summary should be sufficient to restart our conversation about the app in a new thread and make modifications efficiently.