How to View Sync Results

After a sync job completes, the connector stores video metadata, transcripts, and checksums in a structured format. This guide shows you how to access and use this data.

See it in action

Watch how to view a transcript with timestamps and download transcripts from imported videos.

Video demonstrates:

  • Opening transcript modal from video gallery
  • Viewing formatted transcript with timestamps
  • Downloading transcript as .txt file
  • Verifying transcript content

QA test recording by Hamza

What You'll Learn

  • How to access stored metadata and transcripts
  • Understanding the JSON metadata format
  • Viewing transcript files
  • Interpreting checksums and sync history
  • Exporting results for use in CustomGPT.ai or other tools

Prerequisites

Viewing Results in the Dashboard

Summary View

When a sync job completes, the Progress Card displays a high-level summary:

Added: 5
Updated: 2
Deleted: 1
Unchanged: 92

Completed in 87 seconds

What This Tells You:

  • Total Videos Processed: Added + Updated + Deleted + Unchanged = 100 videos
  • New Content: 5 new videos were imported
  • Changed Content: 2 existing videos had content changes
  • Removed Content: 1 video was removed from the collection
  • Unchanged Content: 92 videos were identical to previous imports
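
The arithmetic behind the summary can be checked in a few lines. This is a minimal sketch: the `results` dictionary mirrors the four counters shown on the Progress Card, and the function name `summarize_results` is hypothetical, not part of the connector.

```python
def summarize_results(results: dict) -> dict:
    """Derive totals from the four Progress Card counters."""
    counters = ("added", "updated", "deleted", "unchanged")
    total = sum(results[key] for key in counters)
    # Everything except "unchanged" required work during the sync.
    changed = total - results["unchanged"]
    return {"total_processed": total, "changed": changed}


summary = summarize_results({"added": 5, "updated": 2, "deleted": 1, "unchanged": 92})
print(summary)  # {'total_processed': 100, 'changed': 8}
```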

Detailed View

Click the "View Details" link (below the results summary) to expand detailed results:

Expanded View Shows:

  • List of Added Videos: Video titles, URLs, and whether they have transcripts
  • List of Updated Videos: What changed (e.g., "Transcript added", "Description updated")
  • List of Deleted Videos: Which videos were removed
  • Transcript Coverage: "3 of 5 videos have transcripts (60%)"

Example Expanded View:

Added (5 videos):
✓ Introduction to Product Features (has transcript)
✓ Q3 2024 Update Walkthrough (has transcript)
✗ Behind the Scenes Footage (no transcript)
✓ Customer Success Story (has transcript)
✗ Team Q&A Session (no transcript)

Transcript Coverage: 3 of 5 videos (60%)

Accessing Stored Data

The connector stores synced data in three locations:

1. Video Metadata (JSON Files)

Location: /data/videos/{video_id}.json

Format: JSON with structured video metadata

Access Methods:

Method 1: Via REST API

curl https://vimeo.trustedgpt.io/api/v1/videos/{video_id}

Method 2: Direct File Access (If Self-Hosting)

ls /data/videos/
cat /data/videos/987654321.json

Example Metadata JSON:

{
  "video_id": "987654321",
  "title": "Introduction to Product Features",
  "description": "A comprehensive walkthrough of our latest product features.",
  "duration": 1245,
  "upload_date": "2024-01-15T10:30:00Z",
  "url": "https://vimeo.com/987654321",
  "thumbnail_url": "https://i.vimeocdn.com/video/987654321_640.jpg",
  "tags": ["product", "tutorial", "features"],
  "privacy": "public",
  "has_transcript": true,
  "transcript_file": "/data/transcripts/987654321.txt",
  "checksum": "a1b2c3d4e5f6...",
  "synced_at": "2025-01-17T14:22:10Z"
}

Key Fields:

  • video_id: Unique Vimeo video ID
  • title: Video title
  • description: Video description (may be empty)
  • duration: Video length in seconds
  • url: Direct link to the video on Vimeo
  • thumbnail_url: URL of the video thumbnail image
  • tags: Array of tags (keywords) assigned to the video
  • has_transcript: true if a native transcript was found, false otherwise
  • transcript_file: Path to the transcript text file (present when has_transcript is true)
  • checksum: SHA256 hash of the video's title, description, transcript text, duration, and upload date (used for incremental sync detection)
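
As a rough illustration of reading these fields, the sketch below parses a trimmed copy of the example metadata. The helper name `format_duration` is hypothetical, and the inline JSON stands in for a file fetched from the API or read from disk.

```python
import json

# Trimmed copy of the example metadata above; a real file has more fields.
raw = """{
  "video_id": "987654321",
  "title": "Introduction to Product Features",
  "duration": 1245,
  "has_transcript": true,
  "transcript_file": "/data/transcripts/987654321.txt"
}"""

video = json.loads(raw)


def format_duration(seconds: int) -> str:
    """Convert a duration in seconds to M:SS."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes}:{secs:02d}"


# transcript_file is only meaningful when has_transcript is true.
transcript_path = video["transcript_file"] if video.get("has_transcript") else None

print(video["title"], format_duration(video["duration"]), transcript_path)
```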

2. Transcript Files (Plain Text)

Location: /data/transcripts/{video_id}.txt

Format: Plain text, one line per paragraph/sentence, with timing information stripped

Access Methods:

Method 1: Via REST API

curl https://vimeo.trustedgpt.io/api/v1/transcripts/{video_id}

Method 2: Direct File Access (If Self-Hosting)

cat /data/transcripts/987654321.txt

Example Transcript:

Welcome to this comprehensive product walkthrough.

Today we're going to explore the latest features we've released in Q3 2024.

First, let's take a look at the new dashboard interface.

The dashboard has been redesigned to provide better visibility into your data.

You'll notice the sidebar now includes quick access to your most-used tools.

...

Processing Details:

  • VTT Files: Timing codes (00:01:23.456 --> 00:01:25.789) are stripped, leaving only text
  • SRT Files: Numbering (1, 2, 3) and timing codes are stripped
  • Formatting: Speaker labels and formatting cues (e.g., [MUSIC]) are preserved if present in the source
  • Line Breaks: Preserved to maintain readability (paragraphs/sentences separated by blank lines)
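
The stripping rules above can be approximated as follows. This is only a sketch of the described behavior, not the connector's actual implementation; the function name `strip_timing` and the exact regular expression are assumptions.

```python
import re

# Cue timing line, e.g. "00:01:23.456 --> 00:01:25.789" (VTT) or the
# comma-separated SRT form "00:01:23,456 --> 00:01:25,789".
TIMING = re.compile(
    r"^(\d{2,}:)?\d{2}:\d{2}[.,]\d{3}\s+-->\s+(\d{2,}:)?\d{2}:\d{2}[.,]\d{3}"
)


def strip_timing(captions: str) -> str:
    """Return caption text with headers, cue numbers, and timing lines removed."""
    lines = [line.strip() for line in captions.splitlines()]
    kept = []
    for index, line in enumerate(lines):
        if line == "WEBVTT" or line.startswith("WEBVTT "):
            continue  # VTT file header
        if TIMING.match(line):
            continue  # timing code
        next_line = lines[index + 1] if index + 1 < len(lines) else ""
        if line.isdigit() and TIMING.match(next_line):
            continue  # SRT cue number directly above a timing line
        kept.append(line)  # text, speaker labels, and cues such as [MUSIC]
    # Collapse runs of blank lines so blocks stay separated by one blank line.
    return re.sub(r"\n{3,}", "\n\n", "\n".join(kept)).strip()


srt_sample = (
    "1\n00:00:01,000 --> 00:00:03,000\nWelcome to this walkthrough.\n\n"
    "2\n00:00:03,500 --> 00:00:06,000\n[MUSIC]\n"
)
print(strip_timing(srt_sample))
```

A digits-only caption line that happens to sit directly above a timing line would be dropped by this sketch, which is one reason to treat it as illustrative only.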

3. Checksums (For Incremental Sync)

Location: /data/checksums/{video_id}.hash

Format: SHA256 hash (64 hexadecimal characters) of the video's title, description, transcript text, duration, and upload date

Purpose: Used by Incremental Sync to detect which videos have changed

Access Methods:

Method 1: Via REST API (Included in Video Metadata Response)

curl https://vimeo.trustedgpt.io/api/v1/videos/{video_id} | jq '.checksum'

Method 2: Direct File Access (If Self-Hosting)

cat /data/checksums/987654321.hash

Example Checksum:

a1b2c3d4e5f67890abcdef1234567890abcdef1234567890abcdef1234567890

How Checksums Are Generated:

checksum = SHA256(title + description + transcript_text + duration + upload_date)

If any of these fields change, the checksum changes, triggering an "Updated" status on the next Incremental Sync.
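
A minimal sketch of this formula using Python's hashlib follows. The source does not specify how the fields are joined or encoded, so plain concatenation in the listed order and UTF-8 encoding are assumptions here; checksums produced by this sketch may not match the ones the connector stores.

```python
import hashlib


def compute_checksum(title, description, transcript_text, duration, upload_date) -> str:
    """SHA256 over the five fields named in the formula above."""
    # Assumption: fields are concatenated in this order, with no separator,
    # and encoded as UTF-8 before hashing.
    payload = f"{title}{description}{transcript_text}{duration}{upload_date}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


original = compute_checksum(
    "Introduction to Product Features", "A walkthrough.", "Welcome...", 1245,
    "2024-01-15T10:30:00Z",
)
edited = compute_checksum(
    "Introduction to Product Features", "An updated walkthrough.", "Welcome...", 1245,
    "2024-01-15T10:30:00Z",
)
print(len(original), original != edited)  # 64 True
```

Changing only the description yields a different hash, which is what would mark the video as "Updated" on the next Incremental Sync.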

Viewing All Videos from a Sync Job

Method 1: Using the Dashboard (Simple)

  1. Navigate to the "Recent Jobs" sidebar
  2. Click on the completed job you want to review
  3. Click "View Details" in the expanded Progress Card
  4. Scroll through the list of videos

Advantages: No technical knowledge required

Disadvantages: Limited to viewing in the browser, no export capability

Method 2: Using the REST API (Advanced)

Step 1: Get Job Details

curl https://vimeo.trustedgpt.io/api/v1/sync/{job_id}

Response Includes:

{
  "job_id": "sync_a1b2c3d4",
  "url": "https://vimeo.com/showcase/11708791",
  "status": "completed",
  "results": {
    "added": 5,
    "updated": 2,
    "deleted": 1,
    "unchanged": 92
  },
  "video_ids": [
    "987654321",
    "876543210",
    "765432109",
    ...
  ]
}

Step 2: Fetch Metadata for Each Video

for video_id in 987654321 876543210 765432109; do
  curl "https://vimeo.trustedgpt.io/api/v1/videos/$video_id"
done

Step 3: Download Transcripts

for video_id in 987654321 876543210 765432109; do
  curl "https://vimeo.trustedgpt.io/api/v1/transcripts/$video_id" > "${video_id}.txt"
done

Advantages: Scriptable, automatable, can export to files

Disadvantages: Requires command-line knowledge
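
The three steps can also be combined in one Python script instead of shell loops. This sketch reuses the endpoints from the curl examples and reads the `video_ids` list from the job response. It assumes the transcript endpoint returns plain text and that no authentication header is needed (the curl examples show none); the names `http_get` and `download_job` are hypothetical.

```python
import json
from urllib.request import urlopen

BASE_URL = "https://vimeo.trustedgpt.io/api/v1"


def http_get(url: str) -> bytes:
    """Fetch a URL and return the raw response body."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def download_job(job_id: str, fetch=http_get) -> dict:
    """Return {video_id: {"metadata": ..., "transcript": ...}} for one sync job."""
    job = json.loads(fetch(f"{BASE_URL}/sync/{job_id}"))
    videos = {}
    for video_id in job["video_ids"]:
        metadata = json.loads(fetch(f"{BASE_URL}/videos/{video_id}"))
        transcript = None
        # Only request a transcript when the metadata says one exists.
        if metadata.get("has_transcript"):
            transcript = fetch(f"{BASE_URL}/transcripts/{video_id}").decode("utf-8")
        videos[video_id] = {"metadata": metadata, "transcript": transcript}
    return videos


# Example call (performs real HTTP requests):
# results = download_job("sync_a1b2c3d4")
```

Passing the `fetch` function as a parameter keeps the logic testable without network access.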

Method 3: Bulk Export (API Endpoint)

Endpoint: GET /api/v1/sync/{job_id}/export

Exports All Data from a Job as a Single ZIP File:

curl https://vimeo.trustedgpt.io/api/v1/sync/{job_id}/export -o sync_results.zip
unzip sync_results.zip

ZIP Contents:

sync_results/
├── metadata/
│   ├── 987654321.json
│   ├── 876543210.json
│   └── 765432109.json
├── transcripts/
│   ├── 987654321.txt
│   ├── 876543210.txt
│   └── 765432109.txt
└── summary.json

Advantages:

  • Single download for all data
  • Easy to archive or share
  • Includes summary report

Disadvantages:

  • Large showcases (500+ videos) produce large ZIP files (100+ MB)
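
For large archives it can help to inspect the ZIP before extracting it. The sketch below assumes the layout shown under "ZIP Contents" (metadata/ and transcripts/ folders keyed by video ID); the function name `summarize_export` is hypothetical.

```python
import zipfile
from pathlib import PurePosixPath


def summarize_export(zip_source) -> dict:
    """Count metadata and transcript entries in an export ZIP without extracting it."""
    metadata_ids, transcript_ids = set(), set()
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            path = PurePosixPath(name)
            if path.suffix == ".json" and path.parent.name == "metadata":
                metadata_ids.add(path.stem)
            elif path.suffix == ".txt" and path.parent.name == "transcripts":
                transcript_ids.add(path.stem)
    return {
        "videos": len(metadata_ids),
        "transcripts": len(transcript_ids),
        # Videos that have metadata but no transcript file in the archive.
        "missing_transcripts": sorted(metadata_ids - transcript_ids),
    }


# Example: print(summarize_export("sync_results.zip"))
```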

Using Results with CustomGPT.ai

The Vimeo Connector is designed to feed data into CustomGPT.ai's knowledge base. Here's how to use the results:

Step 1: Export Data

Use the bulk export method (Method 3 above) to download all metadata and transcripts as a ZIP file.

Step 2: Format for CustomGPT.ai

CustomGPT.ai accepts knowledge base entries in JSON format:

{
  "entries": [
    {
      "id": "vimeo_987654321",
      "title": "Introduction to Product Features",
      "content": "Welcome to this comprehensive product walkthrough. Today we're going to explore...",
      "source_url": "https://vimeo.com/987654321",
      "metadata": {
        "duration": "20:45",
        "upload_date": "2024-01-15",
        "tags": ["product", "tutorial", "features"]
      }
    },
    ...
  ]
}

Transformation Script (Python Example):

import json
import os

METADATA_DIR = "sync_results/metadata"
TRANSCRIPT_DIR = "sync_results/transcripts"

# Read exported metadata
videos = []
for filename in sorted(os.listdir(METADATA_DIR)):
    if not filename.endswith(".json"):
        continue  # skip anything that is not a metadata file
    with open(os.path.join(METADATA_DIR, filename), "r", encoding="utf-8") as f:
        video = json.load(f)

    # Read the corresponding transcript, falling back to the description
    transcript_file = os.path.join(TRANSCRIPT_DIR, f"{video['video_id']}.txt")
    if os.path.exists(transcript_file):
        with open(transcript_file, "r", encoding="utf-8") as f:
            transcript = f.read()
    else:
        # An empty description also falls through to the placeholder text
        transcript = video.get("description") or "No transcript available."

    # Format for CustomGPT.ai (duration in seconds becomes M:SS)
    minutes, seconds = divmod(video["duration"], 60)
    videos.append({
        "id": f"vimeo_{video['video_id']}",
        "title": video["title"],
        "content": transcript,
        "source_url": video["url"],
        "metadata": {
            "duration": f"{minutes}:{seconds:02d}",
            "upload_date": video["upload_date"].split("T")[0],
            "tags": video.get("tags", []),
        },
    })

# Write CustomGPT.ai-compatible JSON
with open("customgpt_knowledge_base.json", "w", encoding="utf-8") as f:
    json.dump({"entries": videos}, f, indent=2)

print(f"Exported {len(videos)} videos to customgpt_knowledge_base.json")

Step 3: Upload to CustomGPT.ai

  1. Log into your CustomGPT.ai account
  2. Navigate to Knowledge Base → Import
  3. Upload the customgpt_knowledge_base.json file
  4. CustomGPT.ai will index the video content and make it searchable

Result: Users can now ask questions like:

  • "What product features were covered in Q3 2024?"
  • "Summarize the video about customer success stories"
  • "Find all videos that mention the dashboard redesign"

CustomGPT.ai's AI will search the video transcripts and provide relevant answers with source citations.

Interpreting Transcript Coverage

After a sync, you may notice that not all videos have transcripts. Here's how to interpret and handle this:

Understanding the Coverage Metric

Example: "Transcript Coverage: 3 of 5 videos (60%)"

What This Means:

  • 5 videos were imported
  • 3 of those videos had native Vimeo transcripts
  • 2 videos had no transcript (metadata-only import)
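
The same metric can be recomputed from exported metadata. This sketch assumes each record carries the `has_transcript` field described under Key Fields; the function name `transcript_coverage` is hypothetical.

```python
def transcript_coverage(videos: list) -> str:
    """Format coverage the way the dashboard reports it."""
    total = len(videos)
    with_transcript = sum(1 for video in videos if video.get("has_transcript"))
    # Guard against division by zero when a sync imported nothing.
    percent = round(100 * with_transcript / total) if total else 0
    return f"{with_transcript} of {total} videos ({percent}%)"


sample = [
    {"video_id": "1", "has_transcript": True},
    {"video_id": "2", "has_transcript": True},
    {"video_id": "3", "has_transcript": False},
    {"video_id": "4", "has_transcript": True},
    {"video_id": "5", "has_transcript": False},
]
print("Transcript Coverage:", transcript_coverage(sample))  # 3 of 5 videos (60%)
```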

Why Some Videos Don't Have Transcripts

Vimeo Doesn't Require Transcripts: Many video owners don't upload transcripts.

Auto-Generated Captions Are Not Accessible: Vimeo offers auto-generated captions on some plans, but these are only available through the player UI, not via the API.

Third-Party Videos in Collections: If you import a collection (as opposed to an album), videos from other users may not have transcripts.

What to Do About Missing Transcripts

Option 1: Accept Metadata-Only Imports

  • Video metadata (title, description, tags) is still valuable for search and organization
  • Import the videos as-is and use metadata for discovery

Option 2: Request Transcripts from Video Owner

  • If you have contact with the video owner, ask them to upload transcripts to Vimeo
  • Once uploaded, re-run a Full or Incremental Sync to fetch the new transcripts

Option 3: Generate Transcripts Externally

  • Use OpenAI Whisper, Rev.com, or similar services to generate transcripts
  • Upload the transcripts to Vimeo manually
  • Re-sync to fetch them

Note: The connector does not generate transcripts from audio. It only fetches native Vimeo transcripts.

Exporting for Other Use Cases

Beyond CustomGPT.ai, you can export results for various purposes:

Use Case 1: Generating a Video Index

Extract titles and URLs from the exported metadata files to create a searchable index. The export endpoint returns a ZIP archive, so run jq on the unzipped files rather than piping the download into it:

jq -c '{title, url}' sync_results/metadata/*.json

Output:

{"title": "Introduction to Product Features", "url": "https://vimeo.com/987654321"}
{"title": "Q3 2024 Update Walkthrough", "url": "https://vimeo.com/876543210"}
...

Use Case 2: Transcript Analysis

Export all transcripts to analyze word frequency, topics, or sentiment:

import os
from collections import Counter

TRANSCRIPT_DIR = "sync_results/transcripts"

# Read all transcripts into one string
all_text = ""
for filename in sorted(os.listdir(TRANSCRIPT_DIR)):
    with open(os.path.join(TRANSCRIPT_DIR, filename), "r", encoding="utf-8") as f:
        all_text += f.read() + " "

# Word frequency analysis (splits on whitespace, so punctuation stays attached)
words = all_text.lower().split()
common_words = Counter(words).most_common(20)

print("Most common words:")
for word, count in common_words:
    print(f"{word}: {count}")

Use Case 3: Creating a Transcript Corpus for Training

Export transcripts as a training dataset for machine learning models:

cat sync_results/transcripts/*.txt > training_corpus.txt

Best Practices

Archive Sync Results

Why: Results may not be stored indefinitely on the backend (default retention: 30 days)

How:

  1. Download the bulk export ZIP after each important sync
  2. Store the ZIP in a safe location (cloud storage, local backup)
  3. Include the sync date in the filename: sync_results_2025-01-17.zip

Review Transcript Coverage Before Ingesting into CustomGPT.ai

Why: Videos without transcripts provide limited value for AI-powered search

How:

  1. Check the "Transcript Coverage" metric after each sync
  2. If coverage is less than 30%, consider contacting video owners to request transcripts
  3. Decide whether to include metadata-only videos in your knowledge base

Use Incremental Sync for Periodic Updates

Why: Avoids re-downloading unchanged videos, saving time and API usage

How:

  1. Do an initial Full Sync to import all content
  2. Schedule Incremental Syncs daily/weekly to fetch new or updated videos
  3. Only run Full Sync periodically (monthly) to ensure nothing was missed
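
To illustrate why an Incremental Sync can skip unchanged videos, the sketch below compares stored checksums against freshly computed ones and sorts video IDs into the four result categories. This is an assumption about how such a comparison could work, not the connector's actual logic; the function name `classify_changes` is hypothetical.

```python
def classify_changes(previous: dict, current: dict) -> dict:
    """Compare {video_id: checksum} maps from the last sync and the current one."""
    added = sorted(vid for vid in current if vid not in previous)
    deleted = sorted(vid for vid in previous if vid not in current)
    shared = [vid for vid in current if vid in previous]
    updated = sorted(vid for vid in shared if current[vid] != previous[vid])
    unchanged = sorted(vid for vid in shared if current[vid] == previous[vid])
    return {"added": added, "updated": updated, "deleted": deleted, "unchanged": unchanged}


last_sync = {"987654321": "aaa", "876543210": "bbb", "765432109": "ccc"}
this_sync = {"987654321": "aaa", "876543210": "bbb-changed", "654321098": "ddd"}
print(classify_changes(last_sync, this_sync))
```

Only the IDs under "added" and "updated" would need their content fetched again.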

Validate Data Integrity

Why: Ensure downloads completed successfully and files are not corrupted

How:

  1. Check that JSON files are valid: jq . sync_results/metadata/*.json
  2. Verify transcript files are not empty: wc -l sync_results/transcripts/*.txt
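  3. Compare the number of metadata files with the job's video count: ls sync_results/metadata/ | wc -l (the export layout shown above contains no checksums/ directory; each checksum is stored in the video's metadata JSON)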
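
These checks can be scripted so they run after every export. The sketch below assumes the extracted layout shown under "ZIP Contents" and the `has_transcript` field from the metadata; the function name `validate_export` is hypothetical.

```python
import json
from pathlib import Path


def validate_export(root: str) -> list:
    """Return a list of human-readable problems found in an extracted export."""
    problems = []
    base = Path(root)
    for meta_path in sorted((base / "metadata").glob("*.json")):
        try:
            video = json.loads(meta_path.read_text(encoding="utf-8"))
        except json.JSONDecodeError as exc:
            problems.append(f"{meta_path.name}: invalid JSON ({exc.msg})")
            continue
        # A video that claims a transcript should have a matching .txt file.
        transcript = base / "transcripts" / f"{meta_path.stem}.txt"
        if video.get("has_transcript") and not transcript.exists():
            problems.append(f"{meta_path.name}: transcript file missing")
    for transcript in sorted((base / "transcripts").glob("*.txt")):
        if not transcript.read_text(encoding="utf-8").strip():
            problems.append(f"{transcript.name}: empty transcript")
    return problems


# Example: print(validate_export("sync_results") or "No problems found")
```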
