forceutf8
PHP Class Encoding featuring popular Encoding::toUTF8() function --formerly known as forceUTF8()-- that fixes mixed encoded strings.
Top Related Projects
Pure golang image resizing
Symfony polyfill for the Mbstring extension
Quick Overview
ForceUTF8 is a PHP library designed to handle UTF-8 encoding issues. It provides a set of functions to detect and convert text encodings, ensuring that strings are properly encoded in UTF-8 format. This library is particularly useful for dealing with legacy systems or data sources that may use different character encodings.
Pros
- Simple and easy to use API for handling UTF-8 encoding
- Converts Latin1 (ISO 8859-1), Windows-1252, and UTF-8 input, including strings that mix them, without the caller having to name the source encoding
- Helps prevent encoding-related issues in PHP applications
- Lightweight and has no external dependencies
Cons
- Limited to PHP applications only
- May not handle all possible encoding scenarios
- Lacks extensive documentation and examples
- Not actively maintained (last update was in 2018)
Code Examples
- Basic usage to force UTF-8 encoding:
<?php
use ForceUTF8\Encoding;
$text = "Some text with non-UTF-8 characters";
$utf8_text = Encoding::toUTF8($text);
- Converting an array of strings to UTF-8:
<?php
use ForceUTF8\Encoding;
$array = array("item1" => "non-UTF-8 string", "item2" => "another non-UTF-8 string");
$utf8_array = Encoding::toUTF8($array);
- Repairing a double-encoded (garbled) UTF-8 string:
<?php
use ForceUTF8\Encoding;
$garbled_text = "FÃ©dÃ©ration"; // "Fédération" after being UTF-8 encoded twice
$fixed_text = Encoding::fixUTF8($garbled_text);
Getting Started
To use ForceUTF8 in your PHP project, follow these steps:
- Install the library using Composer:
composer require neitanod/forceutf8
- Include the Composer autoloader in your PHP script:
<?php
require 'vendor/autoload.php';
- Use the library in your code:
<?php
use ForceUTF8\Encoding;
$text = "Your text here";
$utf8_text = Encoding::toUTF8($text);
Competitor Comparisons
Pure golang image resizing
Pros of resize
- Focused on image resizing functionality in Go
- Provides multiple interpolation algorithms for resizing
- Lightweight and efficient for image processing tasks
Cons of resize
- Limited to image resizing operations
- Requires knowledge of Go programming language
- May not be suitable for projects requiring broader image manipulation features
Code comparison
resize:
import "github.com/nfnt/resize"
// Resize using Lanczos resampling
m := resize.Resize(width, height, img, resize.Lanczos3)
forceutf8:
use ForceUTF8\Encoding;
$text = Encoding::toUTF8($text);
Key differences
- Purpose: resize focuses on image processing, while forceutf8 handles character encoding conversion
- Language: resize is written in Go, forceutf8 in PHP
- Functionality: resize offers image resizing algorithms, forceutf8 provides UTF-8 encoding conversion
- Use case: resize is ideal for projects requiring image manipulation, forceutf8 for text encoding issues
- Complexity: resize may require more technical knowledge for image processing, forceutf8 is simpler for encoding conversion
Conclusion
While both repositories serve different purposes, they each address specific needs in their respective domains. resize is more suitable for image-related projects in Go, while forceutf8 is better for handling character encoding issues in PHP applications.
Symfony polyfill for the Mbstring extension
Pros of polyfill-mbstring
- Part of the larger Symfony ecosystem, benefiting from its robust development and support
- Provides a more comprehensive set of multibyte string functions
- Actively maintained with regular updates and improvements
Cons of polyfill-mbstring
- Larger codebase, potentially increasing project size
- May include unnecessary functions for simpler use cases
- Slightly steeper learning curve due to more extensive API
Code Comparison
forceutf8:
$utf8_string = Encoding::toUTF8($string);
$latin1_string = Encoding::toLatin1($string);
polyfill-mbstring:
// The source encoding has to be known and passed explicitly.
$utf8_string = mb_convert_encoding($string, 'UTF-8', 'ISO-8859-1');
$latin1_string = mb_convert_encoding($string, 'ISO-8859-1', 'UTF-8');
Summary
polyfill-mbstring offers a more comprehensive solution for multibyte string handling, especially within the Symfony ecosystem. It provides a wider range of functions and active maintenance. However, forceutf8 may be preferable for simpler projects due to its lightweight nature and straightforward API. The choice between the two depends on the specific project requirements and the developer's familiarity with each library.
README
forceutf8
PHP Class Encoding featuring popular \ForceUTF8\Encoding::toUTF8() function --formerly known as forceUTF8()-- that fixes mixed encoded strings.
Description
If you apply the PHP function utf8_encode() to an already-UTF8 string it will return a garbled UTF8 string.
This class addresses this issue and provides a handy static function called \ForceUTF8\Encoding::toUTF8().
You don't need to know what the encoding of your strings is. It can be Latin1 (ISO 8859-1), Windows-1252 or UTF8, or the string can have a mix of them. \ForceUTF8\Encoding::toUTF8() will convert everything to UTF8.
Sometimes you have to deal with services that are unreliable in terms of encoding, possibly mixing UTF8 and Latin1 in the same string.
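To make the problem concrete, here is a minimal sketch of how double encoding arises. It uses `mb_convert_encoding()` from the mbstring extension (assumed to be loaded) in place of `utf8_encode()`, which is deprecated as of PHP 8.2. `toUtf8Once()` is a hypothetical helper written for this illustration and is not part of the library.

```php
<?php
// "Fédération" as UTF-8 bytes, written with escapes so the encoding of
// this source file does not matter.
$utf8 = "F\xC3\xA9d\xC3\xA9ration";

// Converting Latin1 -> UTF-8 on a string that is already UTF-8 re-encodes
// every byte of each multibyte character, which garbles the text.
$garbled = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

echo strlen($utf8), " bytes before, ", strlen($garbled), " bytes after\n";
// prints "12 bytes before, 16 bytes after"

/**
 * Hypothetical all-or-nothing guard: convert only when the whole string
 * is not valid UTF-8. It cannot repair a string that mixes encodings.
 */
function toUtf8Once(string $text): string
{
    return mb_check_encoding($text, 'UTF-8')
        ? $text
        : mb_convert_encoding($text, 'UTF-8', 'ISO-8859-1');
}

var_dump(toUtf8Once($utf8) === $utf8); // bool(true): left untouched
```

A whole-string check like this still fails on input such as `"F\xE9d\xC3\xA9ration"`, where one character is Latin1 and the other is UTF-8, because the string as a whole is invalid UTF-8 and gets converted in full. That mixed case is the one `toUTF8()` is meant to cover.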
Update:
I've included another function, \ForceUTF8\Encoding::fixUTF8(), which will fix the double (or multiple) encoded UTF8 string that looks garbled.
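In simple cases, double encoding of this kind can be undone by reversing the conversion until the text stops decoding cleanly. The sketch below is a naive illustration of that idea and is not the algorithm `fixUTF8()` uses. `undoDoubleEncoding()` and its `$maxPasses` parameter are hypothetical, mbstring is assumed to be loaded, and only ISO 8859-1 layers (not Windows-1252) are handled.

```php
<?php
/** Hypothetical helper: peel off repeated Latin1-to-UTF-8 encoding layers. */
function undoDoubleEncoding(string $text, int $maxPasses = 5): string
{
    for ($i = 0; $i < $maxPasses; $i++) {
        // Only text made purely of code points U+0000..U+00FF can be the
        // result of re-encoding Latin1 bytes as UTF-8. preg_match() also
        // returns false here when $text is not valid UTF-8.
        if (!preg_match('/^[\x{00}-\x{FF}]*$/u', $text)) {
            break;
        }
        $decoded = mb_convert_encoding($text, 'ISO-8859-1', 'UTF-8');
        // Stop when nothing changes or the result is no longer valid UTF-8.
        if ($decoded === $text || !mb_check_encoding($decoded, 'UTF-8')) {
            break;
        }
        $text = $decoded;
    }
    return $text;
}

$clean   = "F\xC3\xA9d\xC3\xA9ration"; // "Fédération" in UTF-8
$garbled = mb_convert_encoding($clean, 'UTF-8', 'ISO-8859-1');   // encoded twice
$garbled = mb_convert_encoding($garbled, 'UTF-8', 'ISO-8859-1'); // encoded three times

var_dump(undoDoubleEncoding($garbled) === $clean); // bool(true)
```

A heuristic like this also rewrites legitimate text that happens to look double-encoded (a literal "Ã©", for example), so it should only be applied to data known to be garbled.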
Usage:
use \ForceUTF8\Encoding;
$utf8_string = Encoding::toUTF8($utf8_or_latin1_or_mixed_string);
$latin1_string = Encoding::toLatin1($utf8_or_latin1_or_mixed_string);
also:
$utf8_string = Encoding::fixUTF8($garbled_utf8_string);
Examples:
use \ForceUTF8\Encoding;
echo Encoding::fixUTF8("FÃédération Camerounaise de Football\n");
echo Encoding::fixUTF8("Fédération Camerounaise de Football\n");
echo Encoding::fixUTF8("FÃédÃération Camerounaise de Football\n");
echo Encoding::fixUTF8("FÃÃÃÃédÃÃÃÃération Camerounaise de Football\n");
will output:
Fédération Camerounaise de Football
Fédération Camerounaise de Football
Fédération Camerounaise de Football
Fédération Camerounaise de Football
Options:
By default, Encoding::fixUTF8 uses the Encoding::WITHOUT_ICONV flag, signalling that iconv should not be used to fix garbled UTF8 strings.
This class also provides the options Encoding::ICONV_TRANSLIT and Encoding::ICONV_IGNORE, which enable the corresponding flags when iconv is used. The behaviour of these flags is documented in the PHP iconv documentation.
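For orientation, this is roughly what the two iconv flags mean when PHP's own `iconv()` function is called directly. It is a sketch of the general `//IGNORE` and `//TRANSLIT` suffixes, not of this library's internals. `utf8ToWin1252()` is a hypothetical helper, the iconv extension is assumed to be loaded, and the results depend on the iconv implementation PHP was built against.

```php
<?php
/**
 * Hypothetical helper: convert UTF-8 text to Windows-1252 with an optional
 * iconv suffix ('', '//IGNORE' or '//TRANSLIT'). Returns null on failure.
 */
function utf8ToWin1252(string $text, string $suffix = ''): ?string
{
    // iconv() returns false (and raises a notice) when it cannot convert;
    // the notice is suppressed here so the loop below can report it instead.
    $result = @iconv('UTF-8', 'Windows-1252' . $suffix, $text);
    return $result === false ? null : $result;
}

// "šč" in UTF-8: š exists in Windows-1252, č does not.
$text = "\xC5\xA1\xC4\x8D";

foreach (['', '//IGNORE', '//TRANSLIT'] as $suffix) {
    $converted = utf8ToWin1252($text, $suffix);
    printf(
        "%-12s %s\n",
        $suffix === '' ? '(none)' : $suffix,
        $converted === null ? 'conversion failed' : bin2hex($converted)
    );
}
```

Printing the result as hex avoids terminal encoding issues. `//IGNORE` is intended to drop characters the target charset cannot represent and `//TRANSLIT` to approximate them, but some iconv builds report an error instead, which is why the helper returns null rather than assuming success.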
Examples:
use \ForceUTF8\Encoding;
$str = "FÃédération Camerounaise—de—Football\n"; // Uses U+2014 which is invalid ISO8859-1 but exists in Win1252
echo Encoding::fixUTF8($str); // Will break U+2014
echo Encoding::fixUTF8($str, Encoding::ICONV_IGNORE); // Will preserve U+2014
echo Encoding::fixUTF8($str, Encoding::ICONV_TRANSLIT); // Will preserve U+2014
will output:
Fédération Camerounaise?de?Football
Fédération Camerounaise—de—Football
Fédération Camerounaise—de—Football
while:
use \ForceUTF8\Encoding;
$str = "čęėįšųūž"; // Uses several characters not present in ISO8859-1 / Win1252
echo Encoding::fixUTF8($str); // Will break invalid characters
echo Encoding::fixUTF8($str, Encoding::ICONV_IGNORE); // Will remove invalid characters, keep those present in Win1252
echo Encoding::fixUTF8($str, Encoding::ICONV_TRANSLIT); // Will transliterate invalid characters, keep those present in Win1252
will output:
????????
šž
ceeišuuž
Install via composer:
Edit your composer.json file to include the following:
{
    "require": {
        "neitanod/forceutf8": "~2.0"
    }
}
Tips:
You can tip me with Bitcoin if you want. :)