    Haoxiang Sun

    I work on understanding how language models turn data into knowledge, in three parts:

    I. Data In: Accountable Source. If text in a corpus is, as Clark (1996) argues, a trace of joint action with the collaborative process left largely implicit, how can we understand what training data actually carries, and rethink what it means to learn from it responsibly?

    II. Knowledge Within: Structure. If knowledge is, as Tomasello (2000) argues, abstracted from discrete usages in their own context, how can we understand how knowledge organizes itself inside language models and rethink architectures that reflect this multiplicity?

    III. Tool-call Out: Native Agency. If language is, as Vygotsky (1978) argues, a tool through which knowledge is accessed and constructed, how can we understand the limits of compressing knowledge into static parameters, and rethink LMs as natively tool-using agents?

    This site is still under construction.

    Publications