How to Enable and Configure the SCWS PHP Module in ServBay
ServBay is a local web development environment for macOS and Windows that integrates runtimes for PHP, Node.js, Python, Go, Java, and more, as well as databases like MySQL, PostgreSQL, MongoDB, and Redis. It also supports web servers such as Caddy and Nginx. For developers who need efficient Chinese text processing in PHP applications, ServBay comes preinstalled with the SCWS (Simple Chinese Word Segmentation) module, a fast and accurate Chinese word segmenter that is easy to enable.
This guide walks you through enabling the SCWS PHP extension and configuring its dictionary files, then demonstrates basic usage with example code.
SCWS Module Overview
SCWS is an open-source Chinese text segmentation engine renowned for its speed and accuracy. By combining dictionary matching with statistical models, it quickly and accurately segments Chinese text, making it ideal for building Chinese search engines, text mining, content analysis, keyword extraction, and part-of-speech tagging.
Key Features
- High-Performance Segmentation: SCWS uses optimized algorithms to efficiently process large volumes of Chinese text.
- High Accuracy: By leveraging both dictionaries and statistical models, SCWS achieves excellent precision in segmentation tasks.
- Feature-Rich: Besides basic word splitting, it supports advanced features like keyword extraction and part-of-speech tagging.
- Easy Integration: Simple API interfaces make it straightforward to embed into PHP applications.
- Open Source & Free: SCWS is freely available and can be customized as needed.
Preinstalled SCWS Version in ServBay
ServBay supports multiple PHP versions and installs the corresponding SCWS module for each. As of this guide’s writing, ServBay includes SCWS 1.2.3 for PHP 5.6 to PHP 8.4.
How to Enable the SCWS Module
By default, SCWS is disabled in ServBay. You can enable it via two main methods: using the ServBay graphical interface or manually editing configuration files.
Recommended: Enable via ServBay GUI
This is the simplest and fastest method:
- Open the ServBay main interface.
- On the left navigation bar, click Languages and then select PHP.
- In the PHP version list on the right, find the version you wish to enable SCWS for (e.g., PHP 8.4).
- Click the Extensions button next to that PHP version.
- In the pop-up extensions list, locate the SCWS module.
- Flip the switch on the left of SCWS to enable it (it will usually turn green).
- Click the Save button at the bottom of the window.
- ServBay will prompt you to restart the PHP package to apply changes. Click the Restart button.
After completing these steps, SCWS is now enabled for your selected PHP version.
Manual Configuration File Editing (For Advanced Users or Troubleshooting)
If you need finer control or wish to troubleshoot, you can also edit the PHP configuration directly:
Locate the Configuration File: First, find the conf.d directory for your specific PHP version. The SCWS settings are in the scws.ini file inside that directory. The typical path format is:
/Applications/ServBay/etc/php/X.Y/conf.d/scws.ini
Replace X.Y with your exact PHP version, e.g., 8.4.
Edit the scws.ini File: Open scws.ini with a text editor. You'll see:
ini
[scws]
; Uncomment the following line to enable scws
;extension = scws.so
;scws.default.charset = gbk
;scws.default.fpath = /Applications/ServBay/etc/scws
Remove the leading ; from extension = scws.so to activate:
ini
[scws]
; Uncomment the following line to enable scws
extension = scws.so
;scws.default.charset = gbk
;scws.default.fpath = /Applications/ServBay/etc/scws
(Optional) You may also set the default charset and dictionary path here, but it's usually better to set these dynamically in your PHP code for flexibility. If you set them here, remove the leading semicolon and adjust values as needed. For example, for UTF-8 dictionaries:
ini
[scws]
; Uncomment the following line to enable scws
extension = scws.so
scws.default.charset = utf8
scws.default.fpath = /Applications/ServBay/etc/scws
Save and close scws.ini.
Restart the PHP Package: Open the ServBay main interface, go to Packages and locate your edited PHP version (e.g., PHP 8.4). Click its restart button (usually a round arrow icon).
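To double-check a manual edit before restarting, the file's contents can be inspected from PHP. The following is a minimal sketch, not part of ServBay: `scws_ini_summary()` is a hypothetical helper name, and the sample text simply mirrors the scws.ini layout shown above.

```php
<?php
// Hypothetical helper: summarize which SCWS settings are active in scws.ini text.
function scws_ini_summary(string $iniText): array
{
    // INI_SCANNER_RAW keeps values such as paths exactly as written.
    $parsed = parse_ini_string($iniText, true, INI_SCANNER_RAW);
    $section = (is_array($parsed) && isset($parsed['scws'])) ? $parsed['scws'] : [];

    return [
        // Lines starting with ";" are comments, so a commented-out key is absent.
        'enabled' => isset($section['extension']),
        'charset' => $section['scws.default.charset'] ?? null,
        'fpath'   => $section['scws.default.fpath'] ?? null,
    ];
}

// Sample content modeled on the file shown in this guide.
$sample = <<<INI
[scws]
; Uncomment the following line to enable scws
extension = scws.so
scws.default.charset = utf8
;scws.default.fpath = /Applications/ServBay/etc/scws
INI;

print_r(scws_ini_summary($sample));
```

In practice you would pass `file_get_contents()` of your own scws.ini path instead of the sample string.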
How to Verify SCWS Module Is Loaded
After enabling the module, confirm that it loaded successfully. The easiest way is to check PHP's phpinfo() output:
- Within ServBay's recommended site root, /Applications/ServBay/www, create a new subdirectory for testing, e.g., scws-test.
- Inside /Applications/ServBay/www/scws-test, create a file called phpinfo.php.
- Copy this PHP code into phpinfo.php:
php
<?php
phpinfo();
?>
- Ensure your ServBay web server (such as Caddy or Nginx) is properly configured and running, serving sites from /Applications/ServBay/www. By default, ServBay links the domain servbay.demo to this directory.
- In your browser, visit https://servbay.demo/scws-test/phpinfo.php.
. - On the resulting PHP info page, scroll to find the "SCWS" section. If you see SCWS-related settings (such as version number and options), the module has loaded correctly.
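As an alternative to scanning the phpinfo() page, the same check can be done programmatically. This is a small sketch using only standard PHP functions; `extension_report()` is a hypothetical helper name, and the settings it prints depend on what the installed SCWS build registers.

```php
<?php
// Report whether a PHP extension is loaded, plus its version and INI settings.
function extension_report(string $name): array
{
    $loaded = extension_loaded($name);

    return [
        'loaded'   => $loaded,
        // phpversion() returns false when the extension reports no version.
        'version'  => $loaded ? phpversion($name) : false,
        // ini_get_all() is only called for a loaded extension to avoid a warning.
        'settings' => $loaded ? ini_get_all($name, false) : [],
    ];
}

$report = extension_report('scws');
echo $report['loaded'] ? "scws is loaded\n" : "scws is NOT loaded\n";
foreach ($report['settings'] as $key => $value) {
    echo "$key = $value\n";
}
```

Run it through the same PHP version your site uses; a different PHP version may have a different set of extensions enabled.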
Creating and Configuring SCWS Dictionaries
Because SCWS is a dictionary-based segmentation engine, its effectiveness depends largely on its dictionaries. ServBay provides default dictionaries and rule files in /Applications/ServBay/etc/scws. You can also create or use your own dictionaries.
SCWS Dictionary File Formats
SCWS supports plain text and binary xdb dictionary formats. The xdb format is recommended for its speed and efficient memory use.
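The plain text dictionary format is one entry per line. Each line starts with the word, optionally followed by its term frequency (TF; higher means more common), inverse document frequency (IDF), and part-of-speech tag, separated by spaces or tabs. Because the fields are whitespace-separated, a word cannot itself contain spaces:
word1 [TF1] [IDF1] [POS1]
word2 [TF2] [IDF2] [POS2]
...
Example (the TF and IDF values are illustrative):
人工智能 1000 8.0 n
自然语言处理 800 9.0 n
ServBay 500 10.0 nz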
Save your custom vocabulary as a text file, e.g., my_dict.txt. Make sure the file encoding matches your chosen charset (UTF-8 recommended).
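If your vocabulary already lives in application data, the text file can be generated rather than typed by hand. The sketch below assumes the four-column SCWS text layout (word, TF, IDF, part of speech, tab-separated); `dict_line()` is a hypothetical helper, the default values are assumptions, and the entries are illustrative rather than real statistics.

```php
<?php
// Hypothetical helper: format one dictionary entry as "word<TAB>TF<TAB>IDF<TAB>POS".
function dict_line(string $word, float $tf = 1.0, float $idf = 1.0, string $pos = '@'): string
{
    // Fields are whitespace-separated, so the word itself must not contain whitespace.
    if ($word === '' || preg_match('/\s/u', $word)) {
        throw new InvalidArgumentException('A dictionary word must be non-empty and contain no whitespace.');
    }
    // %F formats floats without locale-specific decimal separators.
    return sprintf("%s\t%.1F\t%.1F\t%s", $word, $tf, $idf, $pos);
}

// Illustrative entries: [word, TF, IDF, POS].
$entries = [
    ['人工智能', 1000.0, 8.0, 'n'],
    ['自然语言处理', 800.0, 9.0, 'n'],
    ['ServBay', 500.0, 10.0, 'nz'],
];

$lines = [];
foreach ($entries as $entry) {
    $lines[] = dict_line($entry[0], $entry[1], $entry[2], $entry[3]);
}

// Written to the system temp directory here; point this at your own location.
$target = sys_get_temp_dir() . '/my_dict.txt';
file_put_contents($target, implode("\n", $lines) . "\n");
echo "Wrote " . count($lines) . " entries to $target\n";
```

Save the script itself as UTF-8 so the generated file is UTF-8 as well.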
Generating xdb Dictionary Files
ServBay includes SCWS's scws-gen-dict tool for converting a text dictionary to xdb format.
- Open Terminal on macOS.
- Use cd to enter ServBay's bin directory, or directly specify the path to scws-gen-dict, usually:
bash
/Applications/ServBay/bin/scws-gen-dict -i /path/to/your/my_dict.txt -o /Applications/ServBay/etc/scws/my_dict.utf8.xdb -c utf8
Replace /path/to/your/my_dict.txt with your actual text dictionary path. The -o parameter sets the output xdb file and path; /Applications/ServBay/etc/scws is recommended. -c utf8 sets the input file encoding.
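When dictionary generation is part of a build or deploy script, the same command can be assembled from PHP. This is a sketch only: `gen_dict_command()` is a hypothetical helper, the tool path and the -i, -o, and -c flags are taken from the command above, and the actual `exec()` call is left commented out.

```php
<?php
// Build the scws-gen-dict command line with every argument shell-escaped.
function gen_dict_command(string $txtPath, string $xdbPath, string $charset = 'utf8'): string
{
    $tool = '/Applications/ServBay/bin/scws-gen-dict'; // path as given in this guide

    return implode(' ', [
        escapeshellarg($tool),
        '-i', escapeshellarg($txtPath),   // input text dictionary
        '-o', escapeshellarg($xdbPath),   // output xdb file
        '-c', escapeshellarg($charset),   // encoding of the input file
    ]);
}

$cmd = gen_dict_command(
    '/path/to/your/my_dict.txt',
    '/Applications/ServBay/etc/scws/my_dict.utf8.xdb'
);
echo $cmd, "\n";

// To actually run it on a machine where the tool exists:
// exec($cmd, $output, $exitCode);
// if ($exitCode !== 0) { fwrite(STDERR, implode("\n", $output) . "\n"); }
```

Escaping each argument keeps paths that contain spaces from being split by the shell.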
Configuring SCWS to Use Dictionary Files
After generating the xdb file, specify it in your PHP code:
php
<?php
$scws = scws_new();
$scws->set_charset('utf8'); // Set charset to match dictionary encoding
// Set main dictionary path: can be ServBay’s default or your xdb file
$scws->set_dict('/Applications/ServBay/etc/scws/dict.utf8.xdb');
// If you have multiple dictionaries, append them
$scws->add_dict('/Applications/ServBay/etc/scws/my_dict.utf8.xdb', SCWS_XDICT_XDB); // SCWS_XDICT_XDB: the file is in xdb format
$scws->set_rule('/Applications/ServBay/etc/scws/rules.utf8.ini'); // Rule file for POS tagging etc.; ServBay provides this
// ... further segmentation ...
?>
set_dict() sets the primary dictionary (usually the official, larger SCWS dictionary). add_dict() appends your custom dictionary. Its second argument names the file format: SCWS_XDICT_XDB for a compiled xdb file, or SCWS_XDICT_TXT to load a plain text dictionary directly.
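When a project mixes compiled and plain text dictionaries, the mode passed to add_dict() can be derived from the file name. The sketch below assumes that SCWS_XDICT_TXT denotes a plain text dictionary and SCWS_XDICT_XDB a compiled one; `dict_mode_name()` and `add_dicts()` are hypothetical helpers, and the SCWS calls only run when the extension is loaded.

```php
<?php
// Hypothetical helper: choose the add_dict() mode constant name from the file extension.
function dict_mode_name(string $path): string
{
    $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    return $ext === 'txt' ? 'SCWS_XDICT_TXT' : 'SCWS_XDICT_XDB';
}

// Append each dictionary using the mode that matches its format.
function add_dicts($scws, array $paths): void
{
    foreach ($paths as $path) {
        if (!is_readable($path)) {
            throw new RuntimeException("Dictionary not readable: $path");
        }
        // constant() looks the SCWS constant up by name at run time.
        $scws->add_dict($path, constant(dict_mode_name($path)));
    }
}

if (extension_loaded('scws')) {
    $scws = scws_new();
    $scws->set_charset('utf8');
    $scws->set_dict('/Applications/ServBay/etc/scws/dict.utf8.xdb');
    add_dicts($scws, ['/Applications/ServBay/etc/scws/my_dict.utf8.xdb']);
    $scws->close();
}
```

Checking readability first turns a silent "dictionary not applied" situation into an explicit error.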
SCWS Usage Example
Once SCWS is enabled and the dictionary configured, you can use SCWS functions from your PHP code for word segmentation. Here’s a basic example:
php
<?php
// Verify SCWS extension is loaded
if (!extension_loaded('scws')) {
    die('SCWS extension is not loaded.');
}
// Initialize SCWS object
$scws = scws_new();
if (!$scws) {
    die('Failed to initialize SCWS.');
}
// Set charset (must match your text and dictionary encoding)
$scws->set_charset('utf8');
// Set dictionary file path (ServBay default path)
// set_dict() for main dictionary
$scws->set_dict('/Applications/ServBay/etc/scws/dict.utf8.xdb');
// add_dict() to append a custom dictionary (SCWS_XDICT_XDB: the file is in xdb format)
// $scws->add_dict('/Applications/ServBay/etc/scws/my_dict.utf8.xdb', SCWS_XDICT_XDB);
// Set rule file path (ServBay default), used for POS tagging etc.
$scws->set_rule('/Applications/ServBay/etc/scws/rules.utf8.ini');
// Optional tuning (uncomment as needed):
// $scws->set_ignore(true); // Ignore punctuation
// $scws->set_duality(true); // Join stray single characters into two-character words
// $scws->set_multi(SCWS_MULTI_SHORT | SCWS_MULTI_DUALITY); // Multi-level segmentation: also return shorter sub-words of long compounds
// The Chinese text to segment
$text = "ServBay 是一个强大的本地 Web 开发环境,支持 PHP、Node.js 和多种数据库。";
// Send text to SCWS for processing
$scws->send_text($text);
// Retrieve segmentation results
echo "Original Text: " . $text . "\n\n";
echo "Segmentation Results:\n";
// Loop through segmentation results
while ($result = $scws->get_result()) {
    foreach ($result as $word) {
        // $word is an associative array: 'word', 'idf', 'attr' (POS), etc.
        echo "Word: " . $word['word'] . " (POS: " . $word['attr'] . ")\n";
    }
}
// Release SCWS resources
$scws->close();
?>
Save this as a .php file (e.g., scws_example.php) and place it in ServBay's web directory (such as /Applications/ServBay/www/scws-test/). Visit https://servbay.demo/scws-test/scws_example.php in your browser to see the segmentation results.
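Segmentation output is usually post-processed, for example to keep only nouns as search keywords. The following sketch works on arrays shaped like the items get_result() returns in the example above ('word' and 'attr' keys); `count_words_by_pos()` is a hypothetical helper, and the sample batch is hand-written so the code runs without the extension.

```php
<?php
// Count words whose part-of-speech tag starts with the given prefix.
function count_words_by_pos(array $batch, string $posPrefix): array
{
    $counts = [];
    foreach ($batch as $item) {
        if (strpos($item['attr'], $posPrefix) === 0) {
            $counts[$item['word']] = ($counts[$item['word']] ?? 0) + 1;
        }
    }
    arsort($counts); // most frequent first
    return $counts;
}

// Hand-written sample shaped like one get_result() batch (values are illustrative).
$batch = [
    ['word' => '数据库', 'attr' => 'n'],
    ['word' => '支持', 'attr' => 'v'],
    ['word' => '数据库', 'attr' => 'n'],
    ['word' => 'ServBay', 'attr' => 'en'],
];

print_r(count_words_by_pos($batch, 'n'));
```

In a real script you would call the helper on each batch inside the `while` loop and merge the counts.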
Tips & Notes
- Ensure your SCWS module version matches your PHP version. ServBay handles compatibility, but double-check when configuring manually.
- SCWS’s effectiveness is tightly linked to dictionary quality. For specialized fields, consider building or using domain-specific dictionaries.
- Make sure the SCWS config file (scws.ini), dictionary files (.xdb), and rule files (.ini) have correct paths, and that PHP can read them.
- Always restart the relevant PHP package after configuration changes so that they take effect.
FAQ (Frequently Asked Questions)
Q: I enabled SCWS via the ServBay UI but don't see it in phpinfo(). What should I check?
A: Make sure you restarted the correct PHP package. Multiple PHP versions can run simultaneously; restart the one your site actually uses. If issues persist, try manually editing scws.ini and double-check file paths and syntax.
Q: How do I create and use a custom dictionary?
A: Refer to the "Creating and Configuring SCWS Dictionaries" section above; use scws-gen-dict to convert your text dictionary to xdb format, then load it via the add_dict() method in your PHP code.
Q: What is the purpose of the SCWS rule file (rules.utf8.ini)?
A: The rule file is mainly used for part-of-speech tagging and special segmentation rules. ServBay provides a default file, and there is usually no need to modify it.
Conclusion
ServBay makes it easy to enable and manage the SCWS Chinese word segmentation module for PHP, either through the graphical interface or by editing scws.ini directly. With the preinstalled scws-gen-dict tool and default dictionaries, you can start segmenting Chinese text right away and use the results in features such as search or content analysis.