File Coverage

blib/lib/Text/Phonetic/Phonem.pm
Criterion Covered Total %
statement 19 19 100.0
branch 2 2 100.0
condition n/a
subroutine 4 4 100.0
pod n/a
total 25 25 100.0


line stmt bran cond sub pod time code
1             # ============================================================================
2             package Text::Phonetic::Phonem;
3             # ============================================================================
4 4     4   87974 use utf8;
  4         17  
  4         19  
5              
6 4     4   385 use Moo;
  4         7762  
  4         18  
7             extends qw(Text::Phonetic);
8              
9             our $VERSION = $Text::Phonetic::VERSION;
10              
11             our %DOUBLECHARS = (
12             SC =>'C',
13             SZ =>'C',
14             CZ =>'C',
15             TZ =>'C',
16             SZ =>'C',
17             TS =>'C',
18             KS =>'X',
19             PF =>'V',
20             QU =>'KW',
21             PH =>'V',
22             UE =>'Y',
23             AE =>'E',
24             OE =>'Ö',
25             EI =>'AY',
26             EY =>'AY',
27             EU =>'OY',
28             AU =>'A§',
29             OU =>'§ '
30             );
31              
32             sub _do_encode {
33 17     17   36 my ($self,$string) = @_;
34              
35 17         33 $string = uc($string);
36 17         29 $string =~ tr/A-Z//cd;
37              
38             # Iterate over two character substitutions
39 17         47 foreach my $index (0..((length $string)-2)) {
40 98 100       200 if ($DOUBLECHARS{substr $string,$index,2}) {
41 14         27 substr ($string,$index,2) = $DOUBLECHARS{substr $string,$index,2};
42             }
43             }
44              
45             # Single character substitutions via tr
46 4     4   2584 $string =~tr/ZKGQIJFWPT§/CCCCYYVBDUA/;
  4         8  
  4         56  
  17         54  
47              
48             #delete forbidden characters
49 17         33 $string =~tr/ABCDLMNORSUVWXY//cd;
50              
51             #remove double chars
52 17         20 $string =~tr/ABCDLMNORSUVWXY//s;
53              
54 17         53 return $string;
55             }
56              
57             =encoding utf8
58              
59             =pod
60              
61             =head1 NAME
62              
63             Text::Phonetic::Phonem - Phonem algorithm
64              
65             =head1 DESCRIPTION
66              
67             The PHONEM algorithm is a simple substitution algorithm that was originally
68             implemented in dBase.
69              
70             Implementation of the PHONEM substitutions, as described in Georg Wilde and
71             Carsten Meyer, "Doppelgaenger gesucht - Ein Programm fuer kontextsensitive
72             phonetische Textumwandlung" from ct Magazin fuer Computer & Technik 25/1999.
73              
74             The original variant was implemented as X86-Assembler-Funktion. This
75             implementation does not try to mimic the original code, though it should
76             achieve equal results. As the original software used for building the original
77             implementation was not available, there was no testing for correctness, other
78             than the examples given in the article.
79              
80             The Perl implementation was written by Martin Wilz
81             (L)
82              
83             =head1 AUTHOR
84              
85             Martin Wilz
86             http://wilz.de/view/Themen/MagisterArbeit
87              
88             Maroš Kollár
89             CPAN ID: MAROS
90             maros [at] k-1.com
91             http://www.k-1.com
92              
93             =head1 COPYRIGHT
94              
95             This program is free software; you can redistribute
96             it and/or modify it under the same terms as Perl itself.
97              
98             The full text of the license can be found in the
99             LICENSE file included with this module.
100              
101             =head1 SEE ALSO
102              
103              
104             =cut
105              
106             1;