In this paper we study the complexity of the model-checking problem for the tree logic introduced as the basis for the query language TQL [Cardelli, Ghelli 01: Query Language Based on the Ambient Logic]. We define two distinct fragments of this logic: TL containing only spatial connectives and TLe containing spatial connectives and quantification. We show that the combined complexity of TL is PSPACE-hard. We also study data complexity of model-checking and show that it is linear for TL, hard for all levels of the polynomial hierarchy for TLe and PSPACE-hard for the full logic. Finally we devise a polynomial space model-checking algorithm showing this way that the model-checking problem for the TQL logic is PSPACE-complete.

Automata for unranked trees form a foundation for XML schemas, querying and pattern languages. We study the problem of efficiently minimizing such automata. We start with the unranked tree automata (UTAs) that are standard in database theory, assuming bottom-up determinism and that horizontal recursion is represented by deterministic finite automata. We show that minimal UTAs in that class are not unique and that minimization is NP-hard. We then study more recent automata classes that do allow for polynomial time minimization. Among those, we show that bottom-up deterministic stepwise tree automata yield the most succinct representations.

N-ary queries in trees select sets of n-tuples of nodes. We propose and investigate representation formalisms for n-ary queries by tree automata, both for ranked and unranked trees. We show that existential run-based queries capture MSO in the \nary case as well as universal run-based queries. We then investigate queries by unambiguous tree automata that are relevant for query induction. We characterize queries by unambiguous automata by a natural fragment of MSO, show how to decide whether regular queries are definable in that fragment, and how to answer them efficiently in linear time.

We introduce a new concurrent lambda calculus with futures, lambda(fut), to model the operational semantics of Alice, a concurrent extension of ML. lambda(fut) is a minimalist extension of the call-by-value lambda-calculus that yields the full expressiveness to define, combine, and implement a variety of standard concurrency constructs such as channels, semaphores, and ports. We present a linear type system for lambda(fut) by which the safety of such definitions and their combinations can be proved: Well-typed implementations cannot be corrupted in any well-typed context.

Context unification (CU) is the famous open problem of solving context equations for trees. We distinguish a new decidable fragment of CU - well-nested CU - and present a new unification algorithm that solves well-nested context equations in non-deterministic polynomial time. We show that minimal well-nested solutions of context equations can be composed from the material present in the equation. This surprising property is highly wishful when modeling natural language ellipsis in CU.

In this paper, we study semistructured data and indexes preserving inclusion constraints. A semistructured datum is modelled by multi-rooted edge-labeled directed graphs. We consider regular path queries and inclusion constraints other these data. These constraints are binary relations over regular path expressions q and r, and are interpreted on a datum as ``for this datum, the answer to query q is included in the answer to query r''. We study how to represent inclusion constraints that are common to several data. Our work is based on two existing indexes: dataguide and 1-index. Given a set of data S, we extract from the dataguide of S a finite set C(S) of (finite) inclusion constraints such that an inclusion constraint is satisfied by a datum of S if and only if it is implied by C(S). We use 1-index which are covering indexes preserving inclusion constraints. Experiments compare the different ways of using the 1-index to index a set of data.

We consider semistructured data as rooted edge-labeled directed graphs, and path inclusion constraints on these graphs. In this paper, we give a new decision algorithm for the implication problem of a constraint $p \preceq q$ by a set of constraints $p_i \preceq u_i$ where $p$, $q$, and the $p_i$'s are regular path expressions and the $u_i$'s are non-empty paths, improving in this particular case, the more general algorithms of S. Abiteboul and V. Vianu, and Alechina et al. Moreover, in the case of a set of word equalities $u_i \equiv v_i$, we give a more efficient decision algorithm for the implication of a word equality $u \equiv v$, improving the more general algorithm of P. Buneman et al., and we prove that, in this case, the implication problem for non-deterministic models or for (complete) deterministic models are equivalent.

The constraint language for lambda structures (CLLS) is a description language for lambda terms. CLLS provides parallelism constraints to talk about the tree structure of lambda terms, and lambda binding constraints to specify variable binding. Parallelism constraints alone have the same expressiveness as context unification. In this paper, we show that lambda binding constraints can also be expressed in context unification when permitting tree regular constraints.

In this paper we investigate the quantifier-free fragment of the TQL logic proposed by Cardelli and Ghelli. The TQL logic, inspired from the ambient logic, is the core of a query language for semistructured data represented as unranked and unordered trees. The fragment we consider here, named STL, contains as main features spatial composition and location as well as a fixed point construct. We prove that satisfiability for STL is undecidable.We show also that STL is strictly more expressive that the Presburger monadic second-order logic (PMSO) of Seidl, Schwentick and Muscholl when interpreted over unranked and unordered edge-labeled trees. We define a class of tree automata whose transitions are conditioned by arithmetical constraints; we show then how to compute from a closed STL formula a tree automaton accepting precisely the models of the formula. Finally, still using our tree automata framework, we exhibit some syntactic restrictions over STL formulae that allow us to capture precisely the logics MSO and PMSO.

Subtype satisfiability is an important problem for designing advanced subtype systems and subtype-based program analysis algorithms. The problem is well understood if the atomic types form a lattice. However, little is known about subtype satisfiability over posets. In this paper, we investigate algorithms for and the complexity of subtype satisfiability over general posets. We present a uniform treatment of different flavors of subtyping: simple versus recursive types and structural versus non-structural subtype orders. Our results are established through a new connection of subtype constraints and modal logic. As a consequence, we settle a problem left open by Tiuryn and Wand in 1993.

In this paper, we consider the monadic second order logic (MSO) and two of its extensions, namely Counting MSO (CMSO) and Presburger MSO (PMSO), interpreted over unranked and unordered trees. We survey classes of tree automata introduced for the logics PMSO and CMSO as well as other related formalisms; we gather results from the literature and sometimes clarify or fill the remaining gaps between those various formalisms. Finally, we complete our study by adapting these classes of automata for capturing precisely the expressiveness of the logic MSO.

We investigate the first-order theory of subtyping constraints. We show that the first-order theory of non-structural subtyping is undecidable, and we show that in the case where all constructors are either unary or nullary, the first-order theory is decidable for both structural and non-structural subtyping. Our results hold for both simple and recursive types. The undecidability result is shown by a reduction from the Post's Correspondence Problem, and the decidability results are shown by a reduction to a decision problem on tree automata. In addition, we introduce the notion of a constrained tree automaton to express non-structural subtype entailment. This work is a step towards resolving long-standing open problems of the decidability of entailment for non-structural subtyping.

We present a new learning process for Web information extraction, that is visually interactive. Web documents are considered as trees in which wrappers select sets of nodes. We start from a recent algorithm that induces node selection queries in trees by methods of grammatical inference. We then show how to turn this algorithm into our visually interactive learning process, using intelligent tree pruning heuristics. Experiments on realistic Web documents confirm excellent quality with very few user interactions -- annotations and corrections -- during wrapper induction.