rspade_system/app/RSpade/man/file_upload.txt
2025-11-13 19:10:02 +00:00

NAME
File Upload System - Physical storage and logical attachment models
SYNOPSIS
File upload system with deduplication, hash collision handling, and
polymorphic attachments.
DESCRIPTION
The RSpade file upload system uses a two-model architecture to separate
physical file storage from logical file metadata. This enables efficient
deduplication while maintaining rich metadata for each upload.
MODELS
File_Storage_Model (Framework Model)
Location: /system/app/Models/File_Storage_Model.php
Table: file_storage
Purpose: Represents unique physical files on disk
Fields:
id            Primary key
hash          SHA-256 hash (or incremented variant for collisions)
size          File size in bytes
created_at    Timestamp
updated_at    Timestamp
created_by    User ID who first uploaded this physical file
updated_by    User ID who last referenced this physical file
Multiple File_Attachment_Model records can reference the same
File_Storage_Model if they have identical file content (deduplication).
File_Attachment_Model (User Model)
Location: /rsx/models/File_Attachment_Model.php
Table: file_attachments
Purpose: Represents a logical file upload with metadata
Fields:
id                    Primary key
key                   64-char cryptographically secure random hex
file_storage_id       Foreign key to file_storage table
file_name             Original filename
file_extension        File extension
file_type_id          Enum (1=image, 2=animated_image, 3=video,
                      4=archive, 5=text, 6=document, 7=other)
fileable_type         Polymorphic: parent model class name
fileable_id           Polymorphic: parent model ID
fileable_category     Optional category for grouping files
fileable_type_meta    Indexed VARCHAR for common searchable metadata
fileable_order        Optional sort order
fileable_meta         JSON blob for additional unstructured metadata
site_id               Multi-tenant site ID
session_id            Session ID that uploaded this file (security)
created_at            Upload timestamp
updated_at            Last modified timestamp
created_by            User ID who uploaded
updated_by            User ID who last modified
Relationships:
- file_storage() - BelongsTo File_Storage_Model
- fileable() - MorphTo any model
- site() - BelongsTo Site_Model
STORAGE STRUCTURE
Physical File Location:
storage/files/{dir1}/{dir2}/{hash}
Where:
{dir1} = First 2 characters of hash
{dir2} = Next 2 characters of hash
{hash} = Full SHA-256 hash or incremented variant
Example:
Hash: abc123def456789...
Path: /var/www/html/system/storage/files/ab/c1/abc123def456789...
Directory Structure Benefits:
- Prevents too many files in single directory
- Improves filesystem performance
- Enables efficient file distribution across storage
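The path layout above can be sketched as a small helper. This is an
illustration only: storage_path_for_hash() and its default base directory
are hypothetical names, not part of the framework API.

```php
<?php
// Hypothetical helper: builds the sharded path {base}/{dir1}/{dir2}/{hash}.
function storage_path_for_hash(string $hash, string $base = 'storage/files'): string
{
    if (strlen($hash) < 4 || !ctype_xdigit($hash)) {
        throw new InvalidArgumentException('Expected a hex hash of at least 4 characters');
    }
    $dir1 = substr($hash, 0, 2); // first 2 characters of the hash
    $dir2 = substr($hash, 2, 2); // next 2 characters of the hash
    return "{$base}/{$dir1}/{$dir2}/{$hash}";
}

echo storage_path_for_hash('abc123def456789'), "\n";
// prints "storage/files/ab/c1/abc123def456789"
```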
HASH COLLISION HANDLING
When a file is uploaded, the system checks whether a file already exists
at the target hash path and resolves any collision as follows:
Algorithm:
1. Calculate SHA-256 hash of uploaded file
2. Check if file exists on disk at hash path
3. If exists:
- Perform byte-by-byte comparison
- If match: Reuse existing File_Storage_Model record
- If no match (hash collision):
* Increment hash as base-16 number
* Repeat from step 2
4. If doesn't exist:
- Save file to disk
- Create new File_Storage_Model record
Hash Increment:
The entire SHA-256 hash is treated as a base-16 number and
incremented by 1. This provides effectively unlimited collision
slots while maintaining hash-like distribution.
Example:
abc123 → abc124
abc1ff → abc200
ffffff → 1000000
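A minimal sketch of the increment, assuming lowercase hex input. The
function name increment_hex() is illustrative and is not the framework's
increment_base16_value(). It works digit by digit because a 64-digit value
does not fit in a PHP integer, so hexdec() on the whole string would lose
precision.

```php
<?php
// Illustrative only: add 1 to a hex string of arbitrary length.
function increment_hex(string $hex): string
{
    if ($hex === '') {
        throw new InvalidArgumentException('Empty hex string');
    }
    $digits = '0123456789abcdef';
    $chars = str_split(strtolower($hex));
    for ($i = count($chars) - 1; $i >= 0; $i--) {
        $value = strpos($digits, $chars[$i]);
        if ($value === false) {
            throw new InvalidArgumentException("Not a hex digit: {$chars[$i]}");
        }
        if ($value < 15) {
            $chars[$i] = $digits[$value + 1]; // no carry needed, done
            return implode('', $chars);
        }
        $chars[$i] = '0'; // 'f' rolls over, carry into the digit to the left
    }
    return '1' . implode('', $chars); // carry out of the most significant digit
}

echo increment_hex('abc1ff'), "\n"; // prints "abc200"
```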
Implementation:
File_Storage_Model::increment_base16_value($hash)
File_Storage_Model::find_or_create($temp_file_path)
File_Storage_Model::files_match($file1, $file2)
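The algorithm can be sketched against the plain filesystem as follows.
This is not the framework implementation: bump_hex(), same_bytes(), and
store_file() are hypothetical names, and the database record that
find_or_create() would create or reuse is omitted.

```php
<?php
// Add 1 to a lowercase hex string, carrying leftwards.
function bump_hex(string $hex): string
{
    for ($i = strlen($hex) - 1; $i >= 0; $i--) {
        if ($hex[$i] !== 'f') {
            $hex[$i] = dechex(hexdec($hex[$i]) + 1);
            return $hex;
        }
        $hex[$i] = '0';
    }
    return '1' . $hex;
}

// Byte-by-byte comparison, read in chunks so large files are not loaded whole.
function same_bytes(string $a, string $b): bool
{
    if (filesize($a) !== filesize($b)) {
        return false;
    }
    $fa = fopen($a, 'rb');
    $fb = fopen($b, 'rb');
    try {
        while (!feof($fa)) {
            if (fread($fa, 8192) !== fread($fb, 8192)) {
                return false;
            }
        }
        return true;
    } finally {
        fclose($fa);
        fclose($fb);
    }
}

// Steps 1-4: hash, probe the path, compare on hit, increment on mismatch.
function store_file(string $tmp, string $root): array
{
    $name = hash_file('sha256', $tmp);
    while (true) {
        $dir = $root . '/' . substr($name, 0, 2) . '/' . substr($name, 2, 2);
        $path = $dir . '/' . $name;
        if (!file_exists($path)) {
            if (!is_dir($dir)) {
                mkdir($dir, 0775, true);
            }
            copy($tmp, $path);
            return ['hash' => $name, 'reused' => false];
        }
        if (same_bytes($tmp, $path)) {
            return ['hash' => $name, 'reused' => true]; // deduplicated
        }
        $name = bump_hex($name); // hash collision: try the next slot
    }
}

$root = sys_get_temp_dir() . '/file_storage_sketch_' . bin2hex(random_bytes(4));
$upload = tempnam(sys_get_temp_dir(), 'upl');
file_put_contents($upload, 'hello world');

$first = store_file($upload, $root);  // new content: written to disk
$second = store_file($upload, $root); // identical content: existing file reused
```

A real SHA-256 collision cannot be produced on demand, so the collision
branch is only reachable here by overwriting the stored file with
different bytes before storing the upload again.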
DEDUPLICATION
Multiple users can upload identical files without duplicating physical
storage:
User A uploads logo.png (hash: abc123...)
→ Creates File_Storage_Model (id: 1, hash: abc123...)
→ Creates File_Attachment_Model (storage_id: 1, name: logo.png)
User B uploads company-logo.png (identical content)
→ Reuses File_Storage_Model (id: 1)
→ Creates File_Attachment_Model (storage_id: 1, name: company-logo.png)
Only one physical file exists on disk, but each user has their own
logical attachment record with their chosen filename and metadata.
FILE LIFECYCLE
Creation:
ALL file uploads or file creation operations MUST create a
File_Attachment_Model record. This is the only supported way
to create files in the system.
Orphaned Files:
Attachments that are still not associated with a model via the
polymorphic relationship (fileable_type/fileable_id are NULL) 24 hours
after upload may be deleted by cleanup operations.
Storage Cleanup:
File_Storage_Model records with no associated File_Attachment_Model
records are automatically deleted along with their physical files.
Implementation: File_Attachment_Model::boot() deleted event handler
checks if storage has no remaining references and deletes it
automatically. This ensures storage is cleaned up immediately when
the last attachment is deleted, preventing orphaned files.
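The reference check can be pictured with an in-memory stand-in for the
two tables. Storage_Registry is a hypothetical class used only for this
sketch. The real handler runs inside the model's deleted event and also
removes the physical file.

```php
<?php
// In-memory stand-in for the file_storage and file_attachments tables.
final class Storage_Registry
{
    /** @var array<int,string> storage id => hash */
    public array $storage = [];
    /** @var array<int,int> attachment id => storage id */
    public array $attachments = [];

    // Returns true when the storage record was removed as well.
    public function delete_attachment(int $attachment_id): bool
    {
        $storage_id = $this->attachments[$attachment_id] ?? null;
        if ($storage_id === null) {
            return false; // unknown attachment
        }
        unset($this->attachments[$attachment_id]);
        if (!in_array($storage_id, $this->attachments, true)) {
            unset($this->storage[$storage_id]); // last reference gone
            return true;
        }
        return false; // other attachments still reference this storage
    }
}

$registry = new Storage_Registry();
$registry->storage = [1 => 'abc123'];
$registry->attachments = [10 => 1, 11 => 1]; // two uploads, one physical file
$registry->delete_attachment(10);            // storage 1 is kept
```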
ATTACHMENT API
The framework provides a clean, secure API for attaching files to models.
This is the STANDARD APPROACH for all file attachments going forward.
Security Model:
Files are uploaded UNATTACHED (no fileable_* fields set) and assigned
later using attach_to() or add_to(). This ensures:
- Users can only assign files they uploaded (session validation)
- Files belong to the correct site (multi-tenant isolation)
- No pre-assignment to records the user doesn't own
Attachment Methods (on File_Attachment_Model):
attach_to($model, $category)
Attach file to model, REPLACING any existing attachment in category.
Use for single-file attachments (profile photos, logos, etc).
Example:
$attachment = File_Attachment_Model::find_by_key($key);
if ($attachment->can_user_assign_this_file()) {
$attachment->attach_to($user, 'profile_photo');
}
Behavior:
1. Validates can_user_assign_this_file() - throws exception if fails
2. Detaches any existing attachments with same category
3. Assigns this attachment to the model
add_to($model, $category)
Add file to model WITHOUT replacing existing attachments in category.
Use for multi-file attachments (documents, photos, etc).
Example:
$attachment = File_Attachment_Model::find_by_key($key);
if ($attachment->can_user_assign_this_file()) {
$attachment->add_to($project, 'documents');
}
Behavior:
1. Validates can_user_assign_this_file() - throws exception if fails
2. Assigns this attachment alongside existing ones
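The difference between the two methods, sketched on plain arrays. The
functions assign_file() and files_in() are hypothetical, and the
can_user_assign_this_file() validation step is left out to keep the
replace-versus-append behavior visible.

```php
<?php
// $rows stands in for file_attachments rows, keyed by attachment key.
// $replace = true mimics attach_to(); $replace = false mimics add_to().
function assign_file(array &$rows, string $key, string $type, int $id, string $category, bool $replace): void
{
    if ($replace) {
        foreach ($rows as &$row) {
            $same_slot = $row['fileable_type'] === $type
                && $row['fileable_id'] === $id
                && $row['fileable_category'] === $category;
            if ($same_slot) {
                // Detach the previous occupant of this category
                $row['fileable_type'] = $row['fileable_id'] = $row['fileable_category'] = null;
            }
        }
        unset($row);
    }
    $rows[$key] = ['fileable_type' => $type, 'fileable_id' => $id, 'fileable_category' => $category];
}

// Keys of all attachments currently assigned to the given model and category.
function files_in(array $rows, string $type, int $id, string $category): array
{
    return array_keys(array_filter($rows, fn(array $r) =>
        $r['fileable_type'] === $type
        && $r['fileable_id'] === $id
        && $r['fileable_category'] === $category));
}

$rows = [];
assign_file($rows, 'k1', 'User_Model', 7, 'profile_photo', true);
assign_file($rows, 'k2', 'User_Model', 7, 'profile_photo', true);  // k1 is detached
assign_file($rows, 'k3', 'Project_Model', 3, 'documents', false);
assign_file($rows, 'k4', 'Project_Model', 3, 'documents', false);  // k3 stays
```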
detach()
Detach file from its current assignment without deleting it.
Example:
$attachment->detach();
Warning: Detached files may be cleaned up by periodic maintenance.
can_user_assign_this_file()
Validates whether the current user can assign this attachment.
Checks:
1. File not already assigned (fileable_type/id are NULL)
2. Same site_id as current site
3. Same session_id as current session
Returns: bool
Example:
if ($attachment->can_user_assign_this_file()) {
$attachment->attach_to($user, 'profile_photo');
} else {
return ['success' => false, 'error' => 'Cannot assign file'];
}
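The three checks can be expressed as a pure function over an attachment
row. This is a sketch under assumptions: can_assign() is a hypothetical
name, and the current site and session IDs are passed in explicitly
instead of being read from the framework's session.

```php
<?php
// Hypothetical stand-in for can_user_assign_this_file().
function can_assign(array $attachment, int $current_site_id, string $current_session_id): bool
{
    return $attachment['fileable_type'] === null               // 1. not already assigned
        && $attachment['fileable_id'] === null
        && $attachment['site_id'] === $current_site_id         // 2. same site
        && hash_equals($attachment['session_id'], $current_session_id); // 3. same session
}

$uploaded = [
    'fileable_type' => null,
    'fileable_id'   => null,
    'site_id'       => 1,
    'session_id'    => 'sess-a',
];

var_dump(can_assign($uploaded, 1, 'sess-a')); // bool(true)
var_dump(can_assign($uploaded, 1, 'sess-b')); // bool(false): uploaded in another session
```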
Helper Methods (on all models via Rsx_Model_Abstract):
get_attachment($category)
Get single attachment by category. Use for single-file fields.
Example:
$profile_photo = $user->get_attachment('profile_photo');
if ($profile_photo) {
echo $profile_photo->get_thumbnail_url('cover', 128, 128);
}
get_attachments($category)
Get all attachments by category. Use for multi-file fields.
Example:
$documents = $project->get_attachments('documents');
foreach ($documents as $doc) {
echo $doc->file_name;
}
Complete Workflow Example:
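// Frontend: Upload file
// formData is a FormData instance holding the 'file' and 'site_id' fields
$.ajax({
    url: '/_upload',
    type: 'POST',
    data: formData,
    processData: false, // send the FormData body as-is
    contentType: false, // let the browser set the multipart boundary
    success: function(response) {
        // Store key for form submission
        $('#profile_photo').val(response.attachment.key);
    }
});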
// Backend: Assign file on form save
#[Ajax_Endpoint]
public static function save_profile(Request $request, array $params = []) {
$user = Session::get_user();
if (!empty($params['profile_photo'])) {
$attachment = File_Attachment_Model::find_by_key($params['profile_photo']);
if ($attachment && $attachment->can_user_assign_this_file()) {
$attachment->attach_to($user, 'profile_photo');
}
}
return ['success' => true];
}
// Display: Show attachment
@php
$profile_photo = $user->get_attachment('profile_photo');
@endphp
@if($profile_photo)
<img src="{{ $profile_photo->get_thumbnail_url('cover', 128, 128) }}">
@else
<div>No photo</div>
@endif
POLYMORPHIC ATTACHMENTS (Legacy)
NOTE: Direct manipulation of fileable_* fields is considered LEGACY.
Use the Attachment API (attach_to/add_to) instead.
For programmatic/internal use, files can still be attached using
direct field assignment:
// Internal/programmatic attachment (bypasses security checks)
$attachment->fileable_type = 'User_Model';
$attachment->fileable_id = $user->id;
$attachment->fileable_category = 'system_generated';
$attachment->save();
Querying Attached Files:
File_Attachment_Model::forModel('Project_Model', $id)->get()
File_Attachment_Model::inCategory('documents')->get()
ATTACHMENT METADATA
Three levels of metadata provide flexibility for different use cases:
fileable_category (VARCHAR, nullable)
Broad category for grouping files
Examples: 'avatar', 'background', 'document', 'image', 'video'
fileable_type_meta (VARCHAR, indexed, nullable)
Specific type within category - indexed for efficient queries
Examples: 'profile', 'cover', 'header', 'thumbnail', 'original'
fileable_meta (TEXT/JSON, nullable)
Unstructured JSON blob for additional metadata
Examples: {"variant":"winter", "size":"large", "crop":"center"}
Usage Examples:
// User has profile photo and cover photo
$profile = new File_Attachment_Model();
$profile->fileable_type = 'User_Model';
$profile->fileable_id = $user->id;
$profile->fileable_category = 'image';
$profile->fileable_type_meta = 'profile';
$profile->save();
$cover = new File_Attachment_Model();
$cover->fileable_type = 'User_Model';
$cover->fileable_id = $user->id;
$cover->fileable_category = 'image';
$cover->fileable_type_meta = 'cover';
$cover->save();
// Query by type_meta
$profile = File_Attachment_Model::forModel('User_Model', $user->id)
->byTypeMeta('profile')
->first();
// Using JSON metadata
$attachment->set_meta([
'variant' => 'winter',
'processed' => true,
'filters' => ['brightness', 'contrast']
]);
$attachment->save();
$meta = $attachment->get_meta();
// ['variant' => 'winter', 'processed' => true, ...]
FILE ACCESS
Each File_Attachment_Model has a unique 64-character hex key for
secure access:
URL Generation:
$attachment->get_url() // View/inline
$attachment->get_download_url() // Force download
Example URLs:
/files/abc123def456... // Inline viewing
/files/abc123def456...?download=1 // Download
The key is generated using cryptographically secure random bytes:
File_Attachment_Model::generate_key()
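A 64-character hex key corresponds to 32 random bytes. A minimal sketch of
that idea, assuming PHP's random_bytes() as the source of randomness. The
function name is illustrative, not the body of generate_key().

```php
<?php
// Illustrative only: 32 CSPRNG bytes encoded as 64 lowercase hex characters.
function generate_attachment_key(): string
{
    return bin2hex(random_bytes(32));
}

$key = generate_attachment_key();
echo strlen($key), "\n"; // prints "64"
```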
FILE SIZE
File size is stored ONLY in File_Storage_Model to avoid denormalization.
File_Attachment_Model retrieves size through the relationship:
$attachment->get_size() // Returns bytes (int)
bytes_to_human($attachment->get_size()) // e.g., "2.5 MB"
The bytes_to_human() helper function formats byte values for display.
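For readers without the framework at hand, a formatter of this kind might
look like the sketch below. It is not the framework's bytes_to_human().
The 1024-based units, the one-decimal rounding, and the function name are
all assumptions.

```php
<?php
// Hypothetical formatter: divides by 1024 until the value fits the unit.
function bytes_to_human_sketch(int $bytes, int $precision = 1): string
{
    $units = ['B', 'KB', 'MB', 'GB', 'TB'];
    $value = (float) max($bytes, 0);
    $unit = 0;
    while ($value >= 1024 && $unit < count($units) - 1) {
        $value /= 1024;
        $unit++;
    }
    // Whole bytes are shown without decimals
    return round($value, $unit === 0 ? 0 : $precision) . ' ' . $units[$unit];
}

echo bytes_to_human_sketch(2621440), "\n"; // prints "2.5 MB"
```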
FILE TYPE DETECTION
Files are categorized by MIME type:
File_Attachment_Model::FILE_TYPE_IMAGE (1)
File_Attachment_Model::FILE_TYPE_ANIMATED_IMAGE (2)
File_Attachment_Model::FILE_TYPE_VIDEO (3)
File_Attachment_Model::FILE_TYPE_ARCHIVE (4)
File_Attachment_Model::FILE_TYPE_TEXT (5)
File_Attachment_Model::FILE_TYPE_DOCUMENT (6)
File_Attachment_Model::FILE_TYPE_OTHER (7)
Helper Methods:
$attachment->is_image()
$attachment->is_video()
$attachment->is_document()
$attachment->file_type_id_label
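A rough sketch of MIME-based categorization. The mapping table below is an
assumption made for illustration, since the source does not list which
MIME types fall into which category. Animation cannot be read from the
MIME type alone, so it is passed in as a flag.

```php
<?php
// Type IDs as documented above; the MIME lists are illustrative assumptions.
const TYPE_IMAGE = 1;
const TYPE_ANIMATED_IMAGE = 2;
const TYPE_VIDEO = 3;
const TYPE_ARCHIVE = 4;
const TYPE_TEXT = 5;
const TYPE_DOCUMENT = 6;
const TYPE_OTHER = 7;

function file_type_from_mime(string $mime, bool $is_animated = false): int
{
    $mime = strtolower($mime);
    if (str_starts_with($mime, 'image/')) {
        return $is_animated ? TYPE_ANIMATED_IMAGE : TYPE_IMAGE;
    }
    if (str_starts_with($mime, 'video/')) {
        return TYPE_VIDEO;
    }
    $archives = ['application/zip', 'application/gzip', 'application/x-tar', 'application/x-7z-compressed'];
    if (in_array($mime, $archives, true)) {
        return TYPE_ARCHIVE;
    }
    if (str_starts_with($mime, 'text/')) {
        return TYPE_TEXT;
    }
    if ($mime === 'application/pdf'
        || $mime === 'application/msword'
        || str_starts_with($mime, 'application/vnd.openxmlformats-officedocument.')) {
        return TYPE_DOCUMENT;
    }
    return TYPE_OTHER;
}

echo file_type_from_mime('application/pdf'), "\n"; // prints "6"
```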
HTTP ENDPOINTS
All file attachment operations are handled by a unified controller with
comprehensive security hooks and event support.
Controller: App\RSpade\Core\Files\File_Attachment_Controller
Available Endpoints:
POST /_upload - Upload file
GET /_download/:key - Download file
GET /_inline/:key - View file inline
GET /_thumbnail/:key/:type/:width/:height? - Generate thumbnail
GET /_icon_by_extension/:extension - Get file type icon
UPLOAD ENDPOINT
POST /_upload
Upload files with automatic deduplication, metadata extraction, and session-based
security tracking. Files are uploaded UNATTACHED and must be assigned to models
using the attach_to() or add_to() methods after validation.
Auth: Permission::anybody() (use event hooks for authorization)
Request Parameters:
file (required) - The uploaded file
site_id (required) - Site ID for multi-tenant isolation
filename_override (optional) - Override the uploaded filename
SECURITY: User-provided fileable_* parameters are IGNORED
For security reasons, files uploaded via the web endpoint cannot be
pre-assigned to models. This prevents users from attaching files to
records they don't own.
Workflow:
1. Upload file → Gets session_id, no fileable_* fields set
2. Submit form with attachment key
3. Backend validates can_user_assign_this_file()
4. Backend calls attach_to() or add_to() to assign file
Session Tracking:
All uploaded files automatically receive:
- session_id: Current session ID for security validation
- site_id: Multi-tenant site identifier
These fields enable can_user_assign_this_file() to verify that users
can only assign files they uploaded in their current session.
Response Format (JSON):
{
"success": true,
"attachment": {
"key": "hash...",
"file_name": "document.pdf",
"file_type": "Document",
"file_extension": "pdf",
"url": "http://example.com/files/hash...",
"download_url": "http://example.com/files/hash...?download=1",
"size": 1048576,
"width": null,
"height": null,
"is_animated": false,
"duration": null
}
}
Example Request:
curl -X POST http://localhost/_upload \
    -F "file=@document.pdf" \
    -F "site_id=1"
Any fileable_* fields sent with the request are ignored (see SECURITY
above); assign the file afterwards with attach_to() or add_to().
Event Hooks:
The upload endpoint fires four events for customization:
1. file.upload.authorize (gate)
Purpose: Authorization check
Data: {request: Request, user: User|null, params: array}
Return: true to allow, or Response to deny
Example:
#[OnEvent('file.upload.authorize', priority: 10)]
public static function require_auth($data) {
if (!Session::is_logged_in()) {
return response()->json([
'success' => false,
'error' => 'Authentication required'
], 403);
}
return true;
}
2. file.upload.params (filter)
Purpose: Modify upload parameters before file is saved
Data: array of upload params
Return: modified params array
Example:
#[OnEvent('file.upload.params', priority: 10)]
public static function add_metadata($params) {
$params['fileable_type_meta'] = 'user_uploaded';
return $params;
}
3. file.upload.complete (action)
Purpose: Post-upload processing (logging, notifications)
Data: {attachment: File_Attachment_Model, request: Request, params: array}
Return: void (ignored)
Example:
#[OnEvent('file.upload.complete', priority: 10)]
public static function log_upload($data) {
Log::info("File uploaded: " . $data['attachment']->file_name);
}
4. file.upload.response (filter)
Purpose: Modify response data before sending to client
Data: {success: bool, attachment: array}
Return: modified response array
Example:
#[OnEvent('file.upload.response', priority: 10)]
public static function add_cdn_url($response) {
$response['cdn_url'] = cdn_url($response['attachment']['key']);
return $response;
}
See Also:
php artisan rsx:man event_hooks
DOWNLOAD AND INLINE VIEWING
GET /_download/:key
GET /_inline/:key
Serve uploaded files for download or inline browser display.
Both endpoints implement cascading security checks using event hooks.
URL Parameters:
key - 64-character attachment key
Headers:
Cache-Control: public, max-age=31536000 (1 year)
Security: CASCADING MODEL
Both endpoints check TWO security hooks in order:
1. file.thumbnail.authorize - Base file access permission
2. file.download.authorize - Additional download restrictions
This cascading model ensures:
- Thumbnail security automatically protects downloads
- Download restrictions don't affect thumbnail viewing
- Developers can't forget to secure downloads if they secure thumbnails
Event Hooks:
1. file.thumbnail.authorize (gate)
Purpose: Control file access (thumbnails and downloads)
Data: {attachment: File_Attachment_Model, user: User|null, request: Request}
Return: true to allow, or Response to deny
Example:
#[OnEvent('file.thumbnail.authorize', priority: 10)]
public static function check_file_access($data) {
// Only allow access to files the user owns
if ($data['attachment']->created_by !== $data['user']?->id) {
return response()->json([
'success' => false,
'error' => 'Access denied'
], 403);
}
return true;
}
2. file.download.authorize (gate)
Purpose: Additional restrictions for full file downloads
Data: {attachment: File_Attachment_Model, user: User|null, request: Request}
Return: true to allow, or Response to deny
Example:
#[OnEvent('file.download.authorize', priority: 10)]
public static function check_download_access($data) {
// Require premium subscription for downloads
if (!$data['user']?->has_premium_subscription()) {
return response()->json([
'success' => false,
'error' => 'Premium subscription required'
], 403);
}
return true;
}
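The cascading order can be sketched without the event system. Here gates
are plain closures, a denial is a plain array standing in for a Response
object, and run_gates() is a hypothetical dispatcher. The framework's
actual event dispatch is not shown in this page.

```php
<?php
// Runs gates in order; the first result that is not true is the denial.
function run_gates(array $gates, array $data)
{
    foreach ($gates as $gate) {
        $result = $gate($data);
        if ($result !== true) {
            return $result;
        }
    }
    return true;
}

// Stand-ins for handlers of file.thumbnail.authorize and file.download.authorize
$thumbnail_gates = [
    fn(array $data) => true, // thumbnails are public in this example
];
$download_gates = [
    fn(array $data) => $data['user'] !== null
        ? true
        : ['status' => 403, 'error' => 'Login required to download files'],
];

// Thumbnails check one hook; downloads check the thumbnail hook first, then their own.
$authorize_thumbnail = fn(array $data) => run_gates($thumbnail_gates, $data);
$authorize_download = fn(array $data) => run_gates(array_merge($thumbnail_gates, $download_gates), $data);

var_dump($authorize_thumbnail(['user' => null])); // bool(true)
```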
Response Headers:
/_download - Content-Disposition: attachment (forces download)
/_inline - Content-Disposition: inline (displays in browser)
Example Usage:
// Link to view image in browser
<img src="/_inline/<?= $attachment->key ?>">
// Link to download file
<a href="/_download/<?= $attachment->key ?>">Download</a>
THUMBNAIL GENERATION
GET /_thumbnail/:key/:type/:width/:height?
Generate thumbnails for uploaded files with automatic fallback to file type icons.
URL Parameters:
key - 64-character attachment key
type - Thumbnail mode: 'cover' or 'fit'
width - Width in pixels (10-256)
height - Height in pixels (10-256, optional)
Thumbnail Modes:
cover - Fill entire dimensions, crop excess (like CSS background-size: cover)
fit - Maintain aspect ratio within bounds, transparent background
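The two modes differ only in which scale factor is chosen. The sketch
below computes the scaled image size before any cropping or padding. The
function name is hypothetical, both dimensions are required here, and the
behavior when height is omitted from the URL is not covered.

```php
<?php
// cover: scale so the image fills the box (the excess is cropped afterwards).
// fit:   scale so the image fits inside the box (the remainder stays transparent).
function thumbnail_geometry(string $type, int $src_w, int $src_h, int $width, int $height): array
{
    if (!in_array($type, ['cover', 'fit'], true)) {
        throw new InvalidArgumentException("Unknown thumbnail type: {$type}");
    }
    if ($src_w < 1 || $src_h < 1) {
        throw new InvalidArgumentException('Source dimensions must be positive');
    }
    foreach ([$width, $height] as $dimension) {
        if ($dimension < 10 || $dimension > 256) {
            throw new InvalidArgumentException('Dimensions must be within 10-256');
        }
    }
    $scale_w = $width / $src_w;
    $scale_h = $height / $src_h;
    $scale = $type === 'cover' ? max($scale_w, $scale_h) : min($scale_w, $scale_h);
    return [
        'scaled_width'  => (int) round($src_w * $scale),
        'scaled_height' => (int) round($src_h * $scale),
    ];
}

print_r(thumbnail_geometry('fit', 1920, 1080, 240, 180));
// scaled_width 240, scaled_height 135
```

For a 1920x1080 source, 'cover' at 96x96 scales to 171x96 and then crops
the width to 96.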
Automatic Icon Fallback:
For non-image files (PDF, documents, archives, etc.), the system automatically
generates icon-based thumbnails instead of returning errors:
- Small thumbnails (< 72x72): Icon scaled to fill dimensions
- Large thumbnails (≥ 72x72): 64x64 icon centered on white canvas
This provides consistent visual representation for all file types.
Supported Image Formats:
JPG, PNG, GIF, WEBP, BMP
Future Format Support:
PDF pages, PSD layers, DOCX previews, video frames
Output Format:
WebP with 85% quality (smaller file sizes, broad browser support)
Cache Headers:
Cache-Control: public, max-age=31536000 (1 year)
Security:
Checks file.thumbnail.authorize event hook only
(Downloads check both thumbnail AND download hooks)
Example Usage:
// Profile photo (96x96, cropped to square)
<img src="/_thumbnail/<?= $key ?>/cover/96/96">
// Gallery thumbnail (200x200)
<img src="/_thumbnail/<?= $key ?>/cover/200/200">
// Document preview (maintain aspect ratio)
<img src="/_thumbnail/<?= $key ?>/fit/240/180">
// PDF will show PDF icon, images show actual thumbnail
FILE TYPE ICONS
GET /_icon_by_extension/:extension
Retrieve file type icon as PNG for displaying file types visually.
URL Parameters:
extension - File extension without dot (e.g., 'pdf', 'jpg', 'stl')
Query Parameters:
width - Icon width in pixels (10-256, default: 64)
height - Icon height in pixels (10-256, default: 64)
Icon Library:
80+ file type icons from open-source libraries:
- Papirus (GPL-3.0) - Generic file type icons
- Free File Icons (MIT) - PDF, PSD, AI icons
- Lucide (ISC) - 3D model icons
- Custom - Code files with {} brackets
Icon Categories:
- Images (jpg, png, gif, svg, psd, ai)
- Videos (mp4, avi, mov, webm)
- Audio (mp3, wav, flac)
- Documents (pdf, doc, xls, ppt, txt)
- Archives (zip, rar, 7z)
- Code (php, js, py, html, css)
- 3D Models (stl, obj, fbx, blend)
Output Format:
PNG with transparent background
Cache Headers:
Cache-Control: public, max-age=86400 (24 hours)
Example Usage:
// Show PDF icon
<img src="/_icon_by_extension/pdf?width=64&height=64">
// Show file type icon from attachment
<img src="/_icon_by_extension/<?= $attachment->file_extension ?>">
JavaScript Usage:
// After upload, show file type icon
const extension = response.attachment.file_extension;
$('#icon').attr('src', `/_icon_by_extension/${extension}?width=100&height=100`);
Icon Caching:
Icons are cached by extension and dimensions, allowing efficient
deduplication across all files of the same type.
SECURITY ARCHITECTURE
The file attachment system implements a cascading security model using
event hooks to provide flexible, layered access control.
Security Hook Types:
1. file.upload.authorize - Controls who can upload files
2. file.thumbnail.authorize - Controls who can view thumbnails/files
3. file.download.authorize - Additional restrictions for downloads
Cascading Model:
Thumbnails:
/_thumbnail/:key checks:
- file.thumbnail.authorize ✓
Downloads:
/_download/:key and /_inline/:key check:
- file.thumbnail.authorize ✓
- file.download.authorize ✓
Benefits:
- Implement thumbnail security once, downloads are automatically protected
- Add stricter download rules without affecting thumbnail viewing
- Impossible to forget download security if thumbnail security exists
- Thumbnails can be public while downloads require authentication
Common Patterns:
Public Thumbnails, Authenticated Downloads:
#[OnEvent('file.thumbnail.authorize', priority: 10)]
public static function allow_thumbnails($data) {
// Allow anyone to view thumbnails
return true;
}
#[OnEvent('file.download.authorize', priority: 10)]
public static function require_auth_for_downloads($data) {
if (!Session::is_logged_in()) {
return response()->json([
'success' => false,
'error' => 'Login required to download files'
], 403);
}
return true;
}
Private Files (Thumbnails and Downloads):
#[OnEvent('file.thumbnail.authorize', priority: 10)]
public static function restrict_all_access($data) {
// Check ownership or permissions
if (!can_access_file($data['attachment'], $data['user'])) {
return response()->json([
'success' => false,
'error' => 'Access denied'
], 403);
}
return true;
}
// No need for download hook - thumbnail security covers it
Premium Content:
#[OnEvent('file.thumbnail.authorize', priority: 10)]
public static function allow_previews($data) {
// Allow anyone to view thumbnails (previews)
return true;
}
#[OnEvent('file.download.authorize', priority: 10)]
public static function require_premium($data) {
// Require premium subscription for full downloads
if (!$data['user']?->has_premium_subscription()) {
return response()->json([
'success' => false,
'error' => 'Premium subscription required'
], 403);
}
return true;
}
Event Hook Data:
All download/view hooks receive:
- attachment: File_Attachment_Model instance
- user: Currently authenticated user (or null)
- request: Illuminate\Http\Request instance
Access attachment metadata:
$data['attachment']->fileable_type
$data['attachment']->fileable_id
$data['attachment']->fileable_category
$data['attachment']->file_type_id
$data['attachment']->created_by
See Also:
php artisan rsx:man event_hooks
SEARCH INDEX SYSTEM
Search_Index_Model (Framework Model)
Location: /system/app/Models/Search_Index_Model.php
Table: search_indexes
Purpose: Full-text searchable content extracted from files and other sources
Fields:
id                   Primary key
indexable_type       Polymorphic: source model class name
indexable_id         Polymorphic: source model ID
content              LONGTEXT with FULLTEXT index
metadata             JSON: extracted metadata (author, title, page count, etc.)
indexed_at           When content was extracted (NULL = stale/needs reindex)
extraction_method    Method used (e.g., 'pdftotext', 'tesseract_ocr', 'docx_parser')
language             Language code for stemming (e.g., 'en', 'es', 'fr')
site_id              Multi-tenant site ID
created_at           First indexing timestamp
updated_at           Last update timestamp
created_by           User who triggered indexing
updated_by           User who last updated
Relationships:
- indexable() - MorphTo any model (primary: File_Attachment_Model)
Primary Use Case: File Content Indexing
While this model is polymorphic and can index any content type (blog posts,
comments, pages, etc.), its primary purpose is indexing file content:
- PDF documents → extract text from pages
- Word documents → extract text content
- Images → OCR extracted text
- Excel sheets → extract cell data
- PowerPoint → extract slide text
Why Polymorphic:
Future flexibility for indexing non-file content without creating
duplicate search infrastructure. A single FULLTEXT index across all
searchable content enables powerful cross-content search.
Text Extraction is Expensive:
Parsing PDFs, running OCR, extracting from Office documents is
computationally expensive. This model caches extracted content to avoid
re-processing files on every search.
Stale Index Detection:
indexed_at NULL indicates content needs re-extraction:
- File was updated/replaced
- Extraction method improved (new parser version)
- Initial extraction failed
Search Capabilities:
// Full-text search across all content
Search_Index_Model::search('+urgent +report', 'BOOLEAN')
->where('site_id', $site_id)
->get();
// Search specific model type
Search_Index_Model::search('contract', 'NATURAL LANGUAGE')
->where('indexable_type', 'File_Attachment_Model')
->get();
// Filter by language
Search_Index_Model::search('documento')
->byLanguage('es')
->get();
Metadata Usage:
Extracted metadata enables faceted search and filtering:
- Document author
- Page count
- Creation date (from file metadata, not upload date)
- EXIF data from images (camera, location, timestamp)
- Office document properties
Example:
$index->set_metadata([
'author' => 'John Doe',
'pages' => 42,
'created_date' => '2024-03-15',
'keywords' => ['contract', 'legal', 'confidential']
]);
Index Management:
// Create or update index
$index = Search_Index_Model::find_or_create_for_model(
'File_Attachment_Model',
$attachment->id
);
$index->content = $extracted_text;
$index->extraction_method = 'pdftotext';
$index->language = 'en';
$index->indexed_at = now();
$index->save();
// Mark index as stale (needs re-indexing)
$index->mark_stale();
// Check if needs re-indexing
if ($index->is_stale()) {
// Re-extract content
}
TODO: Future Documentation
Search indexing will be moved to its own man page (search_index.txt)
when non-file indexing features are implemented. For now, it's
documented here as part of the file upload system since that's its
primary use case.
THUMBNAIL SYSTEM
File_Thumbnail_Model (Framework Model)
Location: /system/app/Models/File_Thumbnail_Model.php
Table: file_thumbnails
Purpose: Cache generated thumbnails with deduplication
Fields:
id                      Primary key
source_storage_id       FK to file_storage (original file)
thumbnail_storage_id    FK to file_storage (thumbnail bytes)
params                  JSON: {width, height, crop, format, quality}
detected_mime_type      Actual MIME type from file content
created_at              Generation timestamp
updated_at              Last accessed timestamp
created_by              User who first requested this thumbnail
updated_by              User who last requested this thumbnail
Thumbnail Deduplication:
Thumbnails are cached and deduplicated based on three factors:
- Source file hash (File_Storage_Model.hash)
- Thumbnail parameters (width, height, crop, format, quality)
- Detected MIME type (from actual file content, not user-provided)
Thumbnail Cache Key Formula:
SHA-256(source_hash + params_json + detected_mime_type)
This ensures that:
- Same source file + same params = same thumbnail (deduplicated)
- Different users uploading identical files share thumbnails
- MIME type is based on actual content, not filename
- Cache keys are deterministic and collision-resistant
Example:
User A uploads photo.jpg (hash: abc123..., detected: image/jpeg)
User B uploads photo.png (same bytes, hash: abc123..., detected: image/jpeg)
Request: /files/{key}?thumb=200x200&crop=center
Both generate the same thumbnail cache key:
SHA-256("abc123..." + '{"width":200,"height":200,"crop":"center"}' + "image/jpeg")
Result: One physical thumbnail file shared by both users
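The cache key formula can be sketched in a few lines. Sorting the
parameter keys before encoding is an assumption added here so that equal
parameters always serialize to the same JSON. The source does not state
how the framework canonicalizes params, and thumbnail_cache_key() is a
hypothetical name.

```php
<?php
// SHA-256(source_hash + params_json + detected_mime_type)
function thumbnail_cache_key(string $source_hash, array $params, string $detected_mime_type): string
{
    ksort($params); // assumption: canonical key order for deterministic JSON
    return hash('sha256', $source_hash . json_encode($params) . $detected_mime_type);
}

$key_a = thumbnail_cache_key('abc123', ['width' => 200, 'height' => 200, 'crop' => 'center'], 'image/jpeg');
$key_b = thumbnail_cache_key('abc123', ['crop' => 'center', 'height' => 200, 'width' => 200], 'image/jpeg');

var_dump($key_a === $key_b); // bool(true): same source, params, and MIME type
```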
Thumbnail Generation Flow:
1. Client requests: /files/{attachment_key}?thumb=200x200&crop=center
2. Load File_Attachment_Model by key
3. Load File_Storage_Model via relationship
4. Detect actual MIME type from physical file on disk
5. Calculate thumbnail cache key
6. Check File_Thumbnail_Model for existing thumbnail
- If found: serve cached thumbnail from thumbnail_storage_id
- If not found: generate thumbnail, create File_Storage_Model
for thumbnail bytes, create File_Thumbnail_Model record
7. Return thumbnail to client
Why Detect MIME Type:
Users can upload the same file with different extensions:
- photo.jpg (JPEG data)
- photo.png (same JPEG data, wrong extension)
The system detects actual content type (image/jpeg) from file headers,
ensuring both files share the same thumbnail regardless of filename.
Thumbnail as Derivative Artifact:
Thumbnails are NOT File_Attachment_Model records because:
- They are generated artifacts, not user uploads
- They are tied to source file + params, not user metadata
- They should be deleted when source is deleted (cascade)
- They don't need polymorphic relationships or site-scoping
Cleanup:
When a File_Storage_Model is deleted:
- All File_Thumbnail_Model records cascade delete (foreign key)
- Orphaned thumbnail File_Storage_Model records can be cleaned up
- This is automatic via database foreign key constraints
CLI COMMANDS
User Commands (rsx:file:*)
These commands are always visible and handle file attachment operations.
All operations use file locks to prevent race conditions.
rsx:file:upload
Upload a file from disk and create a File_Attachment_Model.
Usage:
rsx:file:upload /path/to/file.pdf --name="Report" --site=1
Required Options:
--site=ID                   Site ID for multi-tenant scoping
Optional Options:
--name="Name"               Override filename (default: actual filename)
--category="cat"            Set fileable_category
--type-meta="meta"          Set fileable_type_meta
--meta='{"key":"value"}'    Set fileable_meta JSON
Attachment Options:
--model=Model:ID            Attach to model (e.g., User_Model:42)
Returns:
File attachment key for accessing the file
Examples:
# Upload file for specific site
rsx:file:upload /tmp/document.pdf --site=1
# Upload and attach to user profile
rsx:file:upload /tmp/avatar.jpg --site=1 \
--model=User_Model:42 \
--category="avatar" \
--type-meta="profile"
# Upload with metadata
rsx:file:upload /tmp/report.pdf --site=1 \
--meta='{"version":"2.1","department":"Sales"}'
rsx:file:info
Display detailed information about a file attachment.
Usage:
rsx:file:info {key}
Arguments:
key 64-char hex attachment key
Displays:
- File name and extension
- File type and size
- Storage hash
- Attachment metadata
- Polymorphic relationship (if attached)
- Site and user information
- Created/updated timestamps
Example:
rsx:file:info abc123def456...
rsx:file:list
List file attachments with filtering options.
Usage:
rsx:file:list [options]
Required Options (at least one):
--site=ID List files for site
--user=ID List files created by user
--model=Model:ID List files attached to model
Filter Options:
--category="cat" Filter by fileable_category
--type-meta="meta" Filter by fileable_type_meta
--type=image Filter by file_type (image, video, document, etc.)
Format Options:
--format=table Display format (table, json, csv)
--limit=50 Limit results (default: 50)
Examples:
# List all files for site
rsx:file:list --site=1
# List user's profile images
rsx:file:list --user=42 --category="avatar"
# List project documents in JSON
rsx:file:list --model=Project_Model:10 --format=json
rsx:file:attach
Attach an existing file to a different model or update attachment metadata.
Usage:
rsx:file:attach {key} --model=Model:ID [options]
Arguments:
key 64-char hex attachment key
Required Options:
--model=Model:ID Model to attach to
Optional Options:
--category="cat" Update fileable_category
--type-meta="meta" Update fileable_type_meta
--meta='{"key":"value"}' Update fileable_meta JSON
--order=1 Set fileable_order
Example:
# Attach file to project
rsx:file:attach abc123... --model=Project_Model:5 \
--category="document" \
--order=1
rsx:file:detach
Detach a file from its parent model (sets fileable_type and fileable_id to NULL).
Usage:
rsx:file:detach {key}
Arguments:
key 64-char hex attachment key
Warning:
Detached files become orphans and may be deleted by cleanup operations
after 24 hours if not re-attached to a model.
Example:
rsx:file:detach abc123def456...
rsx:file:delete
Delete a file attachment and cleanup orphaned storage.
Usage:
rsx:file:delete {key} [--force]
Arguments:
key 64-char hex attachment key
Options:
--force Skip confirmation prompt
Behavior:
1. Deletes File_Attachment_Model record
2. If no other attachments reference the same File_Storage_Model:
- Deletes physical file from disk
- Deletes File_Storage_Model record
3. If other attachments exist, storage is preserved
Example:
rsx:file:delete abc123def456...
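The delete behavior above amounts to reference counting on the storage
record. The sketch below models it with plain arrays standing in for the
two tables. The array shapes and the function name delete_attachment()
are hypothetical, and a real implementation would also remove the
physical file from disk.

```php
<?php
// Hypothetical in-memory stand-ins for the two tables:
//   $attachments: attachment key => storage hash
//   $storage:     storage hash   => path on disk

// Returns true when the underlying storage was removed as well.
function delete_attachment(array &$attachments, array &$storage, string $key): bool
{
    if (!isset($attachments[$key])) {
        throw new InvalidArgumentException("Unknown attachment key: $key");
    }
    $hash = $attachments[$key];

    // Step 1: remove the logical attachment record.
    unset($attachments[$key]);

    // Step 3: another attachment still references the storage, so keep it.
    if (in_array($hash, $attachments, true)) {
        return false;
    }

    // Step 2: no references remain, so drop the storage record.
    unset($storage[$hash]);
    return true;
}

$attachments = ['key_a' => 'hash_1', 'key_b' => 'hash_1', 'key_c' => 'hash_2'];
$storage     = ['hash_1' => '/data/hash_1', 'hash_2' => '/data/hash_2'];

$first  = delete_attachment($attachments, $storage, 'key_a'); // storage kept
$second = delete_attachment($attachments, $storage, 'key_b'); // storage removed
```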
Storage Commands (rsx:storage:*)
These commands are hidden by default and shown only when
IS_FRAMEWORK_DEVELOPER=true in .env. They provide low-level
access to the physical file storage system.
rsx:storage:info
Display detailed information about physical file storage.
Usage:
rsx:storage:info {hash}
Arguments:
hash SHA-256 hash or incremented variant
Displays:
- Storage hash
- File size (bytes and human-readable)
- Storage path on disk
- Physical file existence
- Reference count (number of attachments)
- Created/updated timestamps
- Creator/updater user IDs
Example:
rsx:storage:info abc123def456...
rsx:storage:stats
Display statistics about the file storage system.
Usage:
rsx:storage:stats [--site=ID]
Options:
--site=ID Filter stats for specific site
Displays:
- Total physical files (File_Storage_Model count)
- Total attachments (File_Attachment_Model count)
- Deduplication ratio (attachments / storage)
- Total disk usage (sum of file sizes)
- Average file size
- Files by type breakdown
- Orphaned storage count
- Orphaned attachments count (no fileable, older than 24 hours)
Example:
rsx:storage:stats
rsx:storage:stats --site=1
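The deduplication ratio is defined above as attachments divided by
storage records. A small sketch of that arithmetic follows. The function
name and the choice to report 0.0 for an empty store are assumptions.

```php
<?php
// Hypothetical helper: attachments per physical file.
// A ratio of 1.0 means no file content is shared between attachments.
function deduplication_ratio(int $attachment_count, int $storage_count): float
{
    if ($storage_count === 0) {
        return 0.0; // assumed convention for an empty store
    }

    return (float) ($attachment_count / $storage_count);
}

// 150 attachments backed by 100 physical files.
$ratio = deduplication_ratio(150, 100);
```

With 150 attachments and 100 storage records the ratio is 1.5, meaning
each physical file is referenced by one and a half attachments on
average.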
rsx:storage:cleanup
Force cleanup of orphaned storage and attachments.
Usage:
rsx:storage:cleanup [options]
Options:
--dry-run Show what would be deleted without deleting
--force Skip confirmation prompts
--orphan-age=24 Hours before orphaned attachment is deleted (default: 24)
Cleanup Operations:
1. Delete orphaned attachments (no fileable, older than orphan-age)
2. Delete orphaned storage (no attachments)
3. Delete physical files for deleted storage
Example:
# Preview cleanup operations
rsx:storage:cleanup --dry-run
# Force cleanup of 48-hour orphans
rsx:storage:cleanup --force --orphan-age=48
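The first cleanup operation selects attachments that have no parent and
are older than --orphan-age hours. The sketch below shows that selection
rule on plain arrays. The record shape, the function name, and measuring
age from a created_at timestamp are assumptions. The command may measure
age differently.

```php
<?php
// Hypothetical record shape:
//   ['key' => string, 'fileable_id' => int|null, 'created_at' => unix timestamp]
function find_orphaned_attachments(array $attachments, int $orphan_age_hours, int $now): array
{
    $cutoff = $now - ($orphan_age_hours * 3600);
    $orphans = [];

    foreach ($attachments as $attachment) {
        $is_unattached = $attachment['fileable_id'] === null;
        $is_old_enough = $attachment['created_at'] < $cutoff;

        if ($is_unattached && $is_old_enough) {
            $orphans[] = $attachment['key'];
        }
    }

    return $orphans;
}

$now = 1700000000;
$attachments = [
    ['key' => 'a', 'fileable_id' => null, 'created_at' => $now - 30 * 3600],
    ['key' => 'b', 'fileable_id' => null, 'created_at' => $now - 2 * 3600],
    ['key' => 'c', 'fileable_id' => 42,   'created_at' => $now - 90 * 3600],
];

$orphans = find_orphaned_attachments($attachments, 24, $now);
```

Only 'a' qualifies here: 'b' is unattached but too recent, and 'c' is old
but still attached to a model.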
Locking Strategy:
All file operations acquire a write lock using RsxLocks to prevent
race conditions:
RsxLocks::get_lock(
RsxLocks::SERVER_LOCK,
RsxLocks::LOCK_FILE_WRITE,
RsxLocks::WRITE_LOCK,
30
)
This ensures:
- Concurrent file uploads cannot cause hash collisions
- Storage cleanup doesn't delete files mid-upload
- Attachment operations are atomic
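For readers unfamiliar with the pattern, the sketch below shows the
general shape of running an operation under an exclusive lock. It uses
PHP's built-in flock() rather than RsxLocks, whose internals are not
described here, and it does not implement the 30-second timeout. The
function name and lock file path are hypothetical.

```php
<?php
// Generic exclusive-lock wrapper using flock(); NOT the RsxLocks API.
function with_write_lock(string $lock_path, callable $operation)
{
    // Mode 'c' creates the lock file if missing without truncating it.
    $handle = fopen($lock_path, 'c');
    if ($handle === false) {
        throw new RuntimeException("Cannot open lock file: $lock_path");
    }

    try {
        // Blocks until no other process holds the lock.
        if (!flock($handle, LOCK_EX)) {
            throw new RuntimeException("Cannot acquire lock: $lock_path");
        }
        try {
            return $operation();
        } finally {
            flock($handle, LOCK_UN);
        }
    } finally {
        fclose($handle);
    }
}

$lock_path = sys_get_temp_dir() . '/rsx_file_write_sketch.lock';
$result = with_write_lock($lock_path, function () {
    // The upload, attach, or cleanup work would run here.
    return 'stored';
});
```

The nested try/finally blocks release the lock and close the handle even
when the operation throws, so a failed upload cannot leave the lock held.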
UPLOAD METHODS
File_Attachment_Model provides factory methods for creating attachments
from various sources. All methods handle storage creation, deduplication,
and metadata automatically.
create_from_upload($file, $params)
Primary method for handling HTTP file uploads.
Parameters:
$file - Illuminate\Http\UploadedFile instance
$params - Array with:
site_id (required) - Site ID
fileable_type - Model class to attach to
fileable_id - Model ID to attach to
fileable_category - Category (e.g., 'avatar')
fileable_type_meta - Searchable metadata
fileable_meta - Array of additional metadata
fileable_order - Sort order
filename_override - Override original filename
Example:
$attachment = File_Attachment_Model::create_from_upload(
$request->file('upload'),
[
'site_id' => $site->id,
'fileable_type' => 'User_Model',
'fileable_id' => $user->id,
'fileable_category' => 'avatar'
]
);
create_from_disk($path, $params)
Create attachment from file already on disk.
Useful for importing files or processing temporary files.
Parameters:
$path - Absolute path to file on disk
$params - Same as create_from_upload() plus:
filename - Filename to use (defaults to basename)
Example:
$attachment = File_Attachment_Model::create_from_disk(
'/tmp/import/document.pdf',
[
'site_id' => $site->id,
'filename' => 'imported-document.pdf',
'fileable_category' => 'import'
]
);
create_from_string($content, $filename, $params)
Create attachment from string content.
Useful for generated files (exports, reports, rendered images).
Parameters:
$content - File content as string
$filename - Filename (must include extension)
$params - Same as create_from_upload()
Example:
$csv = "Name,Email\nJohn,john@example.com";
$attachment = File_Attachment_Model::create_from_string(
$csv,
'export.csv',
[
'site_id' => $site->id,
'fileable_type' => 'Report_Model',
'fileable_id' => $report->id,
'fileable_category' => 'export'
]
);
create_from_url($url, $params)
Download file from URL and create attachment.
Useful for importing external files.
Parameters:
$url - URL to download file from
$params - Same as create_from_upload() plus:
filename - Filename to use (defaults to URL basename)
timeout - Download timeout in seconds (default: 30)
Example:
$attachment = File_Attachment_Model::create_from_url(
'https://example.com/logo.png',
[
'site_id' => $site->id,
'fileable_category' => 'logo'
]
);
All Methods:
- Automatically create or reuse File_Storage_Model (deduplication)
- Auto-detect MIME type and file type
- Generate cryptographically secure 64-char key
- Handle temporary file cleanup
- Safe under concurrent requests (File_Storage_Model uses locking internally)
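Two of the points above can be illustrated with standard PHP functions:
a 64-character hex key and content-based deduplication. The sketch below
assumes the key is 32 random bytes encoded as hex and that deduplication
compares SHA-256 hashes of file content. Both function names are
hypothetical and the framework's generate_key() may work differently.

```php
<?php
// Hypothetical: 32 random bytes encode to exactly 64 hex characters.
function generate_attachment_key(): string
{
    return bin2hex(random_bytes(32));
}

// Identical content always yields the same SHA-256 hash, which is what
// allows two uploads of the same file to share one storage record.
function content_hash(string $content): string
{
    return hash('sha256', $content);
}

$key = generate_attachment_key();

$first_upload  = content_hash("Name,Email\nJohn,john@example.com");
$second_upload = content_hash("Name,Email\nJohn,john@example.com");
$is_duplicate  = $first_upload === $second_upload;
```

random_bytes() draws from the operating system's cryptographically
secure generator, which is the property that makes such keys impractical
to guess.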
FUTURE DEVELOPMENT
Completed Features:
✓ Image thumbnail generation with Imagick (cover and fit modes)
✓ Icon-based thumbnail fallback for non-image files
✓ File type icon system (80+ types)
✓ Cascading security hooks for downloads and thumbnails
✓ WebP output format with quality control
✓ Session-based attachment security with can_user_assign_this_file()
✓ Attachment API (attach_to, add_to, detach)
✓ Model helper methods (get_attachment, get_attachments)
Planned Enhancements:
Periodic Cleanup System:
- Scheduled task to remove unattached files older than 24 hours
- Configurable retention period for orphaned attachments
- Cleanup of orphaned File_Storage_Model records
- Removal of physical files with no storage references
- Dry-run mode for testing cleanup operations
- Metrics and reporting for cleaned files
Implementation:
- Artisan command: rsx:file:cleanup --dry-run
- Scheduled via Laravel task scheduler
- Runs daily at off-peak hours
- Logs cleanup operations for auditing
Cleanup Rules:
- Unattached files (fileable_type/id NULL) > 24hrs → deleted
- Storage with no attachments → physical file deleted
- Session validation prevents premature deletion
Additional Thumbnail Generators:
- PDF page rendering (first page or specific page)
- PSD layer previews
- DOCX/Office document previews
- Video frame extraction
- Audio waveform visualization
Thumbnail Caching:
- Database-backed thumbnail cache (File_Thumbnail_Model)
- Deduplication across identical source files
- Icon-based thumbnail caching by extension+dimensions
- Cache pre-warming for common sizes
- Automatic cleanup of unused thumbnails
JQHTML Upload Widgets:
- Drag-and-drop file uploader component
- Multi-file upload queue with progress
- Image cropper/editor widget
- Camera capture widget
- URL import widget
Advanced Thumbnail Features:
- Smart cropping (face detection, subject detection)
- Additional crop positions (top, bottom, left, right)
- Animated GIF/WebP support
- Custom watermarking
- Image filters and transformations
EXAMPLES
Basic Upload Flow:
// Receive uploaded file
$uploaded = $request->file('upload');
// Create physical storage
$storage = File_Storage_Model::find_or_create(
$uploaded->getPathname()
);
// Create logical attachment
$attachment = new File_Attachment_Model();
$attachment->file_storage_id = $storage->id;
$attachment->file_name = $uploaded->getClientOriginalName();
$attachment->file_extension = pathinfo($attachment->file_name, PATHINFO_EXTENSION);
$attachment->file_type_id = File_Attachment_Model::determine_file_type(
$uploaded->getMimeType()
);
$attachment->key = File_Attachment_Model::generate_key();
$attachment->save();
Attach to Model:
$attachment->fileable_type = 'User_Model';
$attachment->fileable_id = $user->id;
$attachment->fileable_category = 'profile_photo';
$attachment->save();
Retrieve and Display:
$photo = File_Attachment_Model::where('key', $key)->first();
$url = $photo->get_url();
$size = $photo->get_human_size();
$type = $photo->file_type_id_label;
SECURITY
File Access:
Files are accessed via cryptographically secure 64-character keys,
making enumeration attacks infeasible.
Permission Checking:
Routes serving files should implement permission checks based on
the fileable relationship to ensure users can only access files
they own or have permission to view.
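A permission check of this kind might look like the sketch below. The
array shapes, the function name user_can_view_file(), and the specific
rules (same site, then ownership for files attached to a user record)
are all assumptions for illustration. Real rules depend on the
application.

```php
<?php
// Hypothetical shapes:
//   $attachment: ['site_id' => int, 'fileable_type' => string|null, 'fileable_id' => int|null]
//   $user:       ['id' => int, 'site_id' => int]
function user_can_view_file(array $attachment, array $user): bool
{
    // Assumed rule: files are never visible across sites.
    if ($attachment['site_id'] !== $user['site_id']) {
        return false;
    }

    // Assumed rule: a file attached to a user record belongs to that user.
    if ($attachment['fileable_type'] === 'User_Model') {
        return $attachment['fileable_id'] === $user['id'];
    }

    // Other parent types would apply their own rules; deny by default.
    return false;
}

$avatar = ['site_id' => 1, 'fileable_type' => 'User_Model', 'fileable_id' => 42];

$owner_allowed    = user_can_view_file($avatar, ['id' => 42, 'site_id' => 1]);
$stranger_allowed = user_can_view_file($avatar, ['id' => 7,  'site_id' => 1]);
```

Denying by default means a newly introduced fileable type stays private
until a rule is written for it.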
SEE ALSO
model.txt - Model system documentation
storage_directories.txt - Storage directory conventions
migrations.txt - Database migration system
VERSION
RSpade Framework 1.0
Last Updated: 2025-11-04