Namespaces
Variants
Views
Actions

std::mbtowc

From cppreference.com
< cpp‎ | string‎ | multibyte
Revision as of 15:58, 2 November 2012 by P12bot (Talk | contribs)

Defined in header <cstdlib>
int mbtowc( wchar_t* pwc, const char* s, std::size_t n )

Converts a multibyte character whose first byte is pointed to by s to a wide character, written to *pwc if pwc is not null.

If s is a null pointer, resets the global conversion state and determines whether shift sequences are used.

Contents

Parameters

s - pointer to the multibyte character
n - limit on the number of bytes in s that can be examined
pwc - pointer to the wide character for output

Return value

If s is not a null pointer, returns the number of bytes that are contained in the multibyte character or -1 if the first bytes pointed to by s do not form a valid multibyte character or 0 if s is pointing at the null charcter '\0'.

If s is a null pointer, resets its internal conversion state to represent the initial shift state and returns 0 if the current multibyte encoding is not state-dependent (does not use shift sequences) or a non-zero value if the current multibyte encoding is state-dependent (uses shift sequences).

Notes

Each call to mbtowc updates the internal global conversion state (a static object of type std::mbstate_t, only known to this function). If the multibyte encoding uses shift states, care must be taken to avoid backtracking or multiple scans. In any case, multiple threads should not call mbtowc without synchronization: std::mbrtowc may be used instead.

Example

#include <iostream>
#include <clocale>
#include <cstring>
#include <cstdlib>
 
int print_mb(const char* ptr)
{
    std::mbtowc(NULL, 0, 0); // reset the conversion state
    const char* end = ptr + std::strlen(ptr);
    int ret;
    for (wchar_t wc; (ret = std::mbtowc(&wc, ptr, end-ptr)) > 0; ptr+=ret) {
        std::wcout << wc;
    }
    std::wcout << '\n';
    return ret;
}
 
int main()
{
    std::setlocale(LC_ALL, "en_US.utf8");
    // UTF-8 narrow multibyte encoding
    const char* str = u8"z\u00df\u6c34\U0001d10b"; // or u8"zß水𝄋"
                      // or "\x7a\xc3\x9f\xe6\xb0\xb4\xf0\x9d\x84\x8b";
    print_mb(str);
}

Output:

zß水𝄋

See also

Template:cpp/string/multibyte/dcl list mbrtowcTemplate:cpp/string/multibyte/dcl list mblenTemplate:cpp/locale/codecvt/dcl list do in
C documentation for mbtowc